ABSTRACT

The masses of star clusters range over seven decades, from ten up to one
hundred million solar masses. Remarkably, clusters with masses in the range
$10^4 M_\odot$ to $10^6 M_\odot$ show no systematic variation of radius with
mass. However, recent observations have shown that clusters with
$M_{cl} \gtrsim 3\times10^6 M_\odot$ do show an increase in size with
increasing mass. We point out that clusters with $M_{cl} \gtrsim 10^6 M_\odot$
were optically thick to far infrared radiation when they formed, and explore
the hypothesis that the size of clusters with
$M_{cl} \gtrsim 3\times10^6 M_\odot$ is set by a balance between accretion
powered radiation pressure and gravity when the clusters formed, yielding a
mass-radius relation $r_{cl} \sim 0.3\,(M_{cl}/10^6 M_\odot)^{3/5}$ pc. We
show that the Jeans mass in optically thick objects increases systematically
with cluster mass. We argue, by assuming that the break in the stellar
initial mass function is set by the Jeans mass, that optically thick clusters
are born with top heavy initial mass functions; it follows that they are
over-luminous compared to optically thin clusters when young, and have a
higher mass to light ratio $\Upsilon_V = M_{cl}/L_V$ when older than
$\sim 1$ Gyr. Old, optically thick clusters have
$\Upsilon_V \sim M_{cl}^{0.1-0.3}$. It follows that $L_V \sim \sigma^{\beta}$,
where $\sigma$ is the cluster velocity dispersion, and $3.6 < \beta < 4.5$.
It appears that $\Upsilon_V$ is an increasing function of cluster mass for
compact clusters and ultra-compact dwarf galaxies. We show that this is
unlikely to be due to the presence of non-baryonic dark matter, by comparing
clusters to Milky Way satellite galaxies, which are dark matter dominated.
The satellite galaxies appear to have a fixed mass inside a fiducial radius,
$M(r = r_0) = \mathrm{const}$.

Subject headings: galaxies: star clusters — stars: mass function 
1. INTRODUCTION

Most stars in the Milky Way are believed to have formed in star clusters,
e.g., Lada & Lada (1995). In their review Clarke et al. (2000) note that 96%
of stars in the Orion B cloud are in clusters, while 50-80% of stars in
Orion A are. The latter fraction is consistent with that found by
Gomez et al. (1993) in Taurus. Allen et al. (2007) suggest that only
$\sim 20$-25% of stars form outside clusters.

There is independent evidence, based on simulations of field binary star
synthesis rather than on star counts in star forming regions, that the
majority of stars formed in clusters (Kroupa 1995). This result applies to
stars formed over the lifetime of the Milky Way disk. The similarity between
the stellar initial mass function (IMF) inferred from field stars and that
measured in young clusters implies that most low mass stars form in clusters,
and hence alongside massive stars, rather than in an isolated mode, at least
in the Milky Way.

Observations of nearby galaxies suggest that a significant fraction of stars
form in clusters in these galaxies as well. For example, Meurer et al. (1995)
find that 20% of the UV light in seven starburst galaxies they surveyed comes
from young star clusters. Similarly, the cluster fraction of the B band light
in the galaxy merger NGC 3256 found by Zepf et al. (1999) was 19%, while
Fall et al. (2005) found that 20% of the total H$\alpha$ emission in the
Antennae galaxies comes from clusters.

In the case of these external galaxies, the estimates of the cluster fraction
($\sim 20\%$) are likely to be lower limits, a point noted by
Fall et al. (2005). For example, Meurer et al. (1995) estimated that the
cluster fraction of young stars in NGC 5253 was $\sim 14\%$, since that
fraction of the 2200 Å UV light came from clusters. However, more recent
radio and infrared observations of NGC 5253 have revealed the presence of a
deeply embedded star cluster, with a luminosity
$\sim 4\times10^{42}$ erg s$^{-1}$ (Turner et al. 2003), and a
correspondingly large ionizing flux $Q \approx 7\times10^{52}$ s$^{-1}$
(Turner & Beck 2004), indicating an age of order 1 Myr or less. This is 43%
of the bolometric luminosity of the galaxy, given by Gorjian et al. (2001) as
$9.2\times10^{42}$ erg s$^{-1}$, showing that at least 43% of the star
formation in this galaxy occurs in a single cluster. This extremely luminous
cluster was not detected in the UV by Meurer et al. (1995). It seems likely
that much of the clustered star formation in starburst galaxies is similarly
obscured, so that the clustered star formation estimate of
Meurer et al. (1995) is significantly low.

Observations of young star clusters find a power law mass distribution of the
form

$$\frac{dN_{cl}(m)}{dm} = N_0\, m^{-\alpha} \qquad (1)$$

with $1.5 < \alpha < 2$ in both the Milky Way and in other galaxies
(Kennicutt et al. 1989; McKee & Williams 1997). Because $\alpha < 2$, the
integrated mass $\int m\,dN_{cl}$ is dominated by the upper end of the
distribution, which implies that most massive stars are found in the few most
massive clusters. If one believes that massive stars are relevant to galaxy
formation, understanding star formation in massive clusters is crucial.
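The weight of the most massive clusters in such a distribution can be checked with a short numerical sketch. It integrates $m\,dN/dm$ for the power law above. The mass limits (10 and $10^8 M_\odot$, the range of cluster masses quoted in the abstract) and the $10^6 M_\odot$ threshold are illustrative assumptions, not fits to data.

```python
import math

def mass_fraction_above(m, m_min, m_max, alpha):
    """Fraction of the total cluster mass held by clusters more massive than m,
    for dN/dm = N0 * m**(-alpha) between m_min and m_max (N0 cancels out)."""
    def mass_integral(lo, hi):
        # Integral of m * m**(-alpha) dm from lo to hi.
        if math.isclose(alpha, 2.0):
            return math.log(hi / lo)
        p = 2.0 - alpha
        return (hi**p - lo**p) / p
    return mass_integral(m, m_max) / mass_integral(m_min, m_max)

# Assumed, illustrative limits in solar masses.
M_MIN, M_MAX, THRESHOLD = 10.0, 1.0e8, 1.0e6

for alpha in (1.5, 1.8, 2.0):
    frac = mass_fraction_above(THRESHOLD, M_MIN, M_MAX, alpha)
    print(f"alpha = {alpha}: mass fraction above 1e6 Msun = {frac:.2f}")
```

The fraction grows as $\alpha$ decreases below 2, which is the sense in which the few most massive clusters dominate.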

In this paper we show that clusters with initial
$M_{cl} \gtrsim 3\times10^6 M_\odot$ were likely supported by radiation
pressure just before and during the time their stars were forming; it follows
that such clusters will exhibit a mass-radius relation
$r_{cl} \propto M_{cl}^{3/5}$, unlike globular clusters, which have radii
independent of their masses (by cluster radius we mean the projected radius
that encloses half of the cluster light).

In addition, we argue that star formation in massive
($M_{cl} \gtrsim 10^6 M_\odot$) clusters will produce an IMF with a larger
characteristic mass than star formation in less massive clusters. The IMF in
the Milky Way is described by a broken power law (Kroupa 2001;
Muench et al. 2002), or by a log-normal distribution (Miller & Scalo 1979).
Similarly, the IMF in young Milky Way clusters such as Orion is described by
a broken power law, e.g., Muench et al. (2002), similar to that of the Milky
Way as a whole. While the classic Salpeter (1955) IMF has only an upper and
lower mass cutoff, these more recent power-law based estimates of the IMF
involve at least one additional characteristic mass. In either the log-normal
or power-law models, the characteristic mass in the Milky Way is of order
$0.5 M_\odot$.

The origin of the characteristic mass is disputed in the literature. For
example, Adams & Fatuzzo (1996) state that ". . . the Jeans mass has virtually
nothing to do with the masses of forming stars". In contrast,
McKee & Ostriker (2007) state "A recurrent theme in star-formation theory is
that the characteristic mass-defined by the peak of the IMF-is the Jeans mass
at some preferred density." We argue that the second point of view is
correct.

The gas in the interstellar medium is not smoothly distributed; a substantial
fraction is found in the form of giant (tens of parsec size and
$10^6 M_\odot$ mass, in the Milky Way) molecular clouds, which contain parsec
scale clumps, which in turn contain sub-parsec scale cores; see, e.g.,
McKee & Ostriker (2007); Bergin & Tafalla (2007). The density increases as
one moves down the size and mass scale. The masses typically follow a power
law distribution similar in form to that of massive star clusters (eqn. 1
above). This behavior is consistent with the notion that the density and mass
distributions are controlled by supersonic turbulence. Both analytic theory
and simulations predict log-normal distributions in the gas density and gas
surface density of supersonic turbulent flows; observations find surface
densities that are consistent with log-normal distributions.

Measured velocity differences are scale dependent, again 
consistent with turbulent motions; they are large on large 
scales, but decrease with decreasing scale. The length scale 
on which the velocity is equal to the thermal sound speed is 
called the sonic length. Below this scale turbulent motions 
cannot compress the gas, so that the density is no longer con- 
trolled by turbulence; rather, thermal or magnetic pressure, 
along with the self-gravity of the gas, will control the density. 
In the Milky Way the sonic length is ~ 0.03 pc. Star forming
cores are typically about this size or smaller. 

We assume that on scales below the sonic length the fragmentation properties
are controlled by the thermal properties of the gas. We write the ideal gas
equation of state as $P = \rho kT/\mu \sim \rho^{\gamma}$, where $\gamma$ is
the effective adiabatic index of the gas. The Jeans mass scales as
$T^{3/2}/\rho^{1/2} \sim \rho^{(3\gamma-4)/2}$. If $\gamma < 4/3$ an increase
in the gas density (as, for example, when the clump is self-gravitating)
leads to a decrease in the Jeans mass and the likelihood of further
fragmentation. When $\gamma > 4/3$ the Jeans mass is an increasing function
of density, and so in the absence of other physics the gas will not fragment.
For an isothermal equation of state $\gamma = 1$, while for an adiabatic
equation of state $\gamma = 5/3$.
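The sign change at $\gamma = 4/3$ follows directly from the two scalings just quoted, $T \sim \rho^{\gamma-1}$ and $M_J \sim T^{3/2}\rho^{-1/2}$. A minimal sketch using exact fractions (the function name is ours, introduced only for illustration):

```python
from fractions import Fraction

def jeans_mass_density_exponent(gamma):
    """Return d ln M_J / d ln rho for an effective equation of state P ~ rho**gamma.

    Uses T ~ rho**(gamma - 1) and M_J ~ T**(3/2) * rho**(-1/2),
    which combine to the exponent (3*gamma - 4) / 2.
    """
    gamma = Fraction(gamma)
    return Fraction(3, 2) * (gamma - 1) - Fraction(1, 2)

cases = [("isothermal", Fraction(1)),
         ("critical", Fraction(4, 3)),
         ("adiabatic", Fraction(5, 3))]

for label, gamma in cases:
    exponent = jeans_mass_density_exponent(gamma)
    if exponent < 0:
        trend = "M_J falls with density, fragmentation favoured"
    elif exponent > 0:
        trend = "M_J rises with density, fragmentation suppressed"
    else:
        trend = "M_J independent of density"
    print(f"{label:10s} gamma = {gamma}: exponent = {exponent} ({trend})")
```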



A hard lower bound to the characteristic mass is given by the opacity limit
(Low & Lynden-Bell 1976; Rees 1976), expressed by the condition that the
accretion luminosity of a contracting sphere of gas not exceed the black body
luminosity of a sphere of the same radius. The critical mass in this case is
just the Jeans mass, at the radius and density when the sphere becomes
optically thick. This bound applies whether the protostellar core is formed
by turbulence and transient, or if it is self-gravitating. In either case the
luminosity is given roughly by $L \sim v^5/G$; the velocity is
$v \sim \sqrt{GM/r}$ if the clump is contracting, and larger if it is
transient.

Other theories associate the characteristic mass with the Jeans mass at some
other density, e.g., with the density at which the gas first becomes
thermally coupled to dust grains (Larson 1973, 2005; Jappsen et al. 2005).
The argument is that the gas is nearly but not quite isothermal; at low
density the temperature decreases with increasing density, but above the
critical density the temperature begins to increase with increasing density.
This is modeled as a change in the effective adiabatic index $\gamma$ of the
gas, from less than unity to slightly more than unity; numerical experiments
(Li et al. 2003) indicate that gas fragments readily when $\gamma < 1$, but
less readily for $\gamma > 1$.

Bonnell et al. (2006) perform numerical experiments, using an isothermal
equation of state, that show that the mass of the break in the IMF is equal
to the initial Jeans mass, and increases with increasing initial Jeans mass.
When they employ a Larson style equation of state, they find a fixed
characteristic mass in the resulting IMF, corresponding to the density where
the adiabatic index suddenly increases from $\gamma < 1$ to $\gamma > 1$.

Competitive accretion (Zinnecker 1982; Bonnell et al. 2001) schemes also
predict that the characteristic mass is the mean Jeans mass, i.e., the Jeans
mass using the mean density of the star forming region, and
$T \approx 10$ K (Klessen et al. 1998; Bonnell et al. 2006).

Yet other theories for the origin of the IMF start from the notion described
above, that the observed fragmentation in giant molecular clouds arises from
turbulence, e.g., Padoan & Nordlund (2002). The turbulence establishes a
power law in clump mass at the high mass end, with an index similar to the
Salpeter value. However, even in this work the characteristic mass is related
to the Jeans mass (calculated using the mean density) divided by the Alfvenic
or fast magnetosonic Mach number to the 2/3 power (Padoan & Nordlund 2002,
their equation 30).

Elmegreen et al. (2008) note that numerical simulations consistently show a
proportionality between the characteristic mass and the thermal Jeans mass,
and point out that the observed constancy of the characteristic mass then
requires a constant Jeans mass, despite variations in the environments where
stars form. We argue below that under conditions found in ultraluminous
infrared galaxies (ULIRGs) and other massive galaxies neither the Jeans mass
nor the characteristic mass is like that found in the Milky Way.

This paper is organized as follows. In §2 we show how the Jeans mass $M_J$
varies with cluster mass. When the star forming clumps have
$r_{cl} \sim 1$ pc and $M_{cl} \gtrsim 10^6 M_\odot$, the finite optical
depth to the far infrared radiation released by the contraction of the
protocluster gas or by the dissipation of turbulent motion leads to an
increase in temperature, and, we show, in the Jeans mass. The role of
radiation pressure in setting the radius of very massive clusters is
described in §3. In §4 we compare our results to observations of massive
young clusters, of ultra-compact dwarfs (UCDs) and of central massive objects
such as those in Geha et al. (2002) and Walcher et al. (2005). We give a
short discussion of our results in the context of previous models for very
massive star clusters in §5 and offer our conclusions in §6. In the
appendices we discuss the log-normal distribution, the initial mass function
(or IMF) that we use and the calculation of the associated light to mass
ratio, we outline the calculation of the mass-radius relation for a radiation
pressure supported cluster, and finally we discuss the dark matter density on
10 pc scales predicted by recent numerical simulations.



2. PROTOCLUSTER FRAGMENTATION AND THE INITIAL MASS 
FUNCTION 

Observations of embedded clusters in the Milky Way show that the star
formation efficiency (SFE), the fraction of cluster gas that ends up in
stars, is 10-30% (Lada & Lada 2003) for clusters with stellar mass
$\sim 10$-$1000 M_\odot$. In contrast to this well established result, the
time scale over which the gas is converted into stars is nearly as
contentious as the origin of the characteristic mass; it is either the
dynamical time, e.g., Elmegreen (2000), or a few to several dynamical times,
e.g., Tan, Krumholz & McKee (2006). In appendix A we show that the
combination of relatively high SFE and short star formation timescale implies
that the initial density of the gas that ends up in stars is no more than a
factor $\rho_m/\bar\rho \sim 10$ larger than the mean density $\bar\rho$ of
the cluster for rapid star formation; it follows that the Jeans mass of this
gas is no less than $M_J/\langle M_J\rangle \approx 1/3$, where
$\langle M_J\rangle$ is the Jeans mass calculated using the mean cluster
density. For a more extended star-formation time, say 5 dynamical times, and
the minimum SFE of 10%, this ratio may be as low as 1/6. Both these estimates
are likely to be low compared to reality, since some stars undoubtedly form
in less dense but more extended self-gravitating sub-clumps.
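The two ratios quoted above can be related to density contrasts with a one-line scaling. The sketch below assumes the star forming gas and the mean cluster gas share the same temperature, so that $M_J \propto \rho^{-1/2}$; that isothermal assumption and the function names are ours.

```python
def jeans_mass_ratio(density_contrast):
    """M_J / <M_J> for gas denser than the mean by `density_contrast`,
    assuming a common temperature so that M_J ~ rho**-0.5."""
    return density_contrast ** -0.5

def density_contrast_for_ratio(ratio):
    """Inverse relation: density contrast needed to reach a given M_J / <M_J>."""
    return ratio ** -2

# Rapid star formation: contrast of about 10, ratio close to 1/3.
print(f"contrast 10  -> M_J/<M_J> = {jeans_mass_ratio(10.0):.2f}")

# Extended star formation with SFE of 10%: a ratio of 1/6 corresponds to
# a density contrast of roughly 36.
print(f"ratio 1/6    -> contrast  = {density_contrast_for_ratio(1.0 / 6.0):.0f}")
```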

The point here is that, while log-normal distributions are 
broader than gaussians, in the present context they are still 
rather narrow, so the gas destined for star formation cannot 
come from too far out on the tail of the density distribution. 

Observations of magnetic fields, via Zeeman splitting in OH masers, in both
the Milky Way, with $B \sim 3$ mG (Fish et al. 2003), and in nearby
ultraluminous infrared galaxies, also with $B \sim 3$ mG
(Robishaw et al. 2008), indicate that the fast magnetosonic Mach number
$\mathcal{M}_f < 5$. Thompson et al. (2006) argue that in ULIRGs the volume
average field is also $\sim 1$ mG, so that the Mach number is globally of the
same order.

Simulations suggest that $\rho_m/\bar\rho$ depends only weakly on
$\mathcal{M}_f$ of the turbulence (Ostriker, Stone & Gammie 2001). Since the
Mach number does not vary over a large range, and assuming that turbulence in
the ISM of galaxies is adequately modeled by recent simulations, it follows
that the mean Jeans mass of the cluster is a good proxy for the mean Jeans
mass of the star forming gas in clusters.

Henceforth we will assume that the characteristic mass is in 
fact the Jeans mass of the proto-stellar gas, 



where 




(3) 



is the Jeans length, $c_s^2 = kT/\mu m_p$ gives the sound speed $c_s$, and
$\phi_J$ is a dimensionless constant of order unity, accounting for the
difference between the mean density of the cluster and the mean density of
that fraction of the cluster gas that ends up forming stars.

In a proto-cluster clump of gas, the mean density is given by

$$\bar\rho = \frac{3 M_{cl}}{4\pi r_{cl}^3}, \qquad (4)$$

where $M_{cl}$ and $r_{cl}$ are the initial gas mass and radius.

We note that it is likely that both the initial $M_{cl}$ and $r_{cl}$ differ
from the final cluster mass and radius. For example, we noted that the SFE
was of order 10-30%. If the unused gas is expelled on a dynamical time or
longer, the cluster will expand by a factor $\sim 1/\mathrm{SFE}$
(Hills 1980) from its original radius. This result is confirmed by numerical
simulations (Baumgardt & Kroupa 2007, see their Fig. 4). The simulations show
that if the gas is expelled on a shorter time scale, the cluster will expand
by a somewhat larger factor, or it may even be destroyed, if the SFE is less
than 30%. Baumgardt & Kroupa (2007) show that the final cluster radius will
actually be smaller than the initial radius if the cluster is subject to
strong tidal forces from its host galaxy.

It is not clear that a cluster with $M_{cl} = 10^6 M_\odot$ and
$r_{cl} = 3$ pc will have SFE as low as $\approx 0.3$. Such a cluster has an
escape velocity of order 50 km s$^{-1}$, well above the sound speed of
photoionized gas, so any HII region that is formed will be dynamically
irrelevant. To see this, note that the dynamical pressure in the cluster is

$$P \approx \pi G \Sigma^2 \approx 8\times10^{-6}\ \mathrm{g\,cm^{-1}\,s^{-2}}. \qquad (5)$$

Meanwhile, the mean particle number density is
$n \approx 3\times10^5$ cm$^{-3}$, so the pressure of ionized gas is
$P_{HII} = 4\times10^{-7}$ g cm$^{-1}$ s$^{-2}$, dynamically negligible.
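The comparison between the two pressures can be reproduced to order of magnitude. The sketch below is not the calculation behind equation (5): it assumes the dynamical pressure can be estimated as $\bar\rho v^2$ with $v^2 = GM_{cl}/r_{cl}$, takes one particle per proton mass, and adopts $T = 10^4$ K for photoionized gas. All three choices are our assumptions.

```python
import math

# Rounded CGS constants.
G = 6.674e-8        # cm^3 g^-1 s^-2
K_B = 1.381e-16     # erg K^-1
M_P = 1.673e-24     # g
M_SUN = 1.989e33    # g
PC = 3.086e18       # cm

def mean_density(mass_g, radius_cm):
    """Mean density of a uniform sphere, 3 M / (4 pi r^3)."""
    return 3.0 * mass_g / (4.0 * math.pi * radius_cm**3)

def dynamical_pressure(mass_g, radius_cm):
    """Assumed estimate: P ~ rho * v**2 with v**2 = G M / r."""
    return mean_density(mass_g, radius_cm) * G * mass_g / radius_cm

def ionized_gas_pressure(mass_g, radius_cm, temperature_k=1.0e4):
    """Assumed estimate: P = n k T with n = rho / m_p and T = 1e4 K."""
    n = mean_density(mass_g, radius_cm) / M_P
    return n * K_B * temperature_k

mass = 1.0e6 * M_SUN
radius = 3.0 * PC
p_dyn = dynamical_pressure(mass, radius)
p_hii = ionized_gas_pressure(mass, radius)
print(f"dynamical pressure   ~ {p_dyn:.1e} g cm^-1 s^-2")
print(f"ionized gas pressure ~ {p_hii:.1e} g cm^-1 s^-2")
print(f"ratio                ~ {p_dyn / p_hii:.0f}")
```

Under these assumptions the dynamical pressure exceeds the ionized gas pressure by more than an order of magnitude, in line with the text.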

Nor can stellar winds expel the cluster gas; the bolometric luminosity of
$10^6 M_\odot$ of zero age main sequence stars is roughly
$L = 6\times10^{42}$ erg s$^{-1}$, while the wind kinetic luminosity is
$L_w \approx 2\times10^{40}$ erg s$^{-1}$. The cooling rate of the wind is
$L_{cool} = \Lambda n_h^2 V \sim 10^{42}$ erg s$^{-1}$, where $V$ is the
volume of the cluster and $n_h \approx 6\times10^3$ cm$^{-3}$ is found by
assuming the shocked stellar wind has a pressure comparable to the dynamical
pressure. In other words, the wind luminosity is not sufficient to produce a
pressure high enough to support the weight of the overlying gas. Alternately,
the cooling time $\tau_{cool} = kT/\Lambda n_h \approx 2\times10^{10}$ s is
much shorter than the dynamical time
$\tau_{dyn} = r/v \approx 3\times10^{12}$ s; the wind cools before it can
push the overlying cold gas out of the cluster.

Supernovae will not remove much gas from the cluster either; the binding
energy of the cluster is $3\times10^{52}$ erg, about 30 times the kinetic
energy supplied by a single supernova. While one expects multiple supernovae
to explode, they do not explode all at once. For typical initial mass
functions the supernova luminosity is $L_{SN} \approx 10^{40}$ erg s$^{-1}$,
similar to the wind luminosity. Thus supernovae deposit a total energy
$L_{SN}\tau_{dyn} \sim 2\times10^{50}$ erg in a dynamical time, not
sufficient to unbind the cluster (of course the gas cools in less than a
dynamical time).

FIG. 1. The lockup fraction $\alpha(m_1, Z/Z_\odot = 0.2; t)$ for the IMF
given in equation (B2) and several values of $m_1$, as labeled. The vertical
dotted line is at 12 Gyrs.

This argument applies to young clusters, with a substantial fraction of their
mass in the form of gas. However, for clusters in which the gas fraction is
very small, gas removal may be very efficient. For example, in our
$10^6 M_\odot$ cluster, if the gas fraction is reduced to 1/30 of the total
mass (perhaps because the rest of the mass has been put into stars), the
binding energy of the gas is reduced to $\sim 10^{51}$ erg. In that case, a
single supernova may be able to eject the gas. Some such mechanism appears to
operate in Milky Way globular clusters, as inferred via the following
argument: as stars evolve off the main sequence, they lose their envelopes.
The mass loss rate from this process would lead to substantial amounts of
intercluster gas accumulating, of order 100-1000 $M_\odot$ between passages
through the plane of the galaxy (Roberts 1988). Observations of the
intercluster medium in globular clusters (Freire et al. 2001) show
conclusively that this gas does not remain in the cluster, resulting in a net
decrease in cluster mass. This reduction in cluster mass is quantified in the
so-called lockup fraction $\alpha(t)$, the fraction of mass locked up in
present day stars and stellar remnants; see, e.g., chapter 7.3 of
Pagel (1997). For a Muench et al. IMF with $m_1 = 0.6 M_\odot$,
$\alpha(12\,\mathrm{Gyrs}) \approx 0.45$, see Figure 1. In our context,
$\alpha(t)$ is a very rough lower limit to $\epsilon$.
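The binding energies used in the supernova argument can be checked with a simple estimate. The sketch assumes the gas binding energy is of order $G M_{gas} M_{cl}/r_{cl}$ and takes $10^{51}$ erg per supernova, the value implied by the statement that the full binding energy is about 30 supernova energies; the function name is ours.

```python
# Rounded CGS constants.
G = 6.674e-8        # cm^3 g^-1 s^-2
M_SUN = 1.989e33    # g
PC = 3.086e18       # cm
E_SN = 1.0e51       # erg, assumed kinetic energy of one supernova

def gas_binding_energy(mass_msun, radius_pc, gas_fraction=1.0):
    """Order-of-magnitude binding energy of the gas, G * M_gas * M_total / r, in erg."""
    m_total = mass_msun * M_SUN
    m_gas = gas_fraction * m_total
    return G * m_gas * m_total / (radius_pc * PC)

e_young = gas_binding_energy(1.0e6, 3.0)                    # gas dominated
e_old = gas_binding_energy(1.0e6, 3.0, gas_fraction=1 / 30)  # gas poor

print(f"gas rich: {e_young:.1e} erg = {e_young / E_SN:.0f} supernovae")
print(f"gas poor: {e_old:.1e} erg = {e_old / E_SN:.1f} supernovae")
```

With a gas fraction of 1/30 the binding energy of the gas is comparable to the output of a single supernova, which is the point made above.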

The Jeans mass associated with a cluster of mass $M_{cl}$ is set by the
temperature $T$ and the radius $r_{cl}$. We examine the Jeans mass in three
different cases: 1) when the cluster is either optically thin to far infrared
radiation (FIR), or of sufficiently low accretion luminosity that the
effective temperature is $\sim 10$ K; 2) when the cluster is optically thick
and $T_{eff} \gg 10$ K, but the cluster is not radiation pressure supported;
and 3) when the protocluster gas is radiation pressure supported by the
accretion generated luminosity.

To be quantitative, we need to pick a particular IMF. We will use a modified
version of the Muench et al. (2002) IMF, described in appendix B and
quantified in equation (B2). We consider two modifications to the
Muench et al. IMF. The first consists of ignoring the small mass bump
(between masses of 0.025 and $0.017 M_\odot$), extending the 0.73 power law
down to the minimum mass we consider, usually $m_L = 0.1 M_\odot$. The second
modification is that we use a maximum stellar mass of $m_U = 120 M_\odot$.
Since we ignore the small-mass bump, there are two free parameters (in
addition to $m_L$ and $m_U$) in our IMF, the two break masses $m_1$ and
$m_2 < m_1$. Muench et al. find that in the Orion Nebula Cluster,
$m_1 = 0.6 M_\odot$ and $m_2 = 0.12 M_\odot$. In this paper we assume that
the larger mass is in fact the Jeans mass, $m_1 = M_J$.

2.1. Optically thin protoclusters 

When the protocluster is optically thin to far infrared radiation, as is the
case for almost all clusters in the Milky Way at the present epoch, the
temperature is very nearly independent of cluster radius (or alternately,
density); the gas temperature is observed to be around 10-20 K, e.g.,
Clemens & Barvainis (1988); Rathborne et al. (2006). The Jeans mass is then
only dependent on the cluster density. If the mass-radius relation in young
Milky Way clusters is taken to be $r_{cl} \sim M_{cl}^{1/3}$, which is
consistent with the clusters listed in Lada & Lada (2003), the Jeans mass is
the same in all current Milky Way protoclusters, and hence (under our
assumptions) so is the IMF. This is consistent with observations of the IMF
in the Milky Way, except possibly in the galactic center.

For example, consider the best studied young cluster, the Trapezium-ONC
cluster in Orion. Hillenbrand (1997) lists 1576 stars, of which 973 are given
a probability larger than 0.5 of being members; Hillenbrand gives a lower
limit to the stellar mass of $900 M_\odot$, which she estimates is about half
of the actual mass. Half the stars lie within 0.72 pc of the Trapezium,
assuming a distance of 400 pc to Orion (Kharchenko et al. 2005). The
extinction in our direction is small, $A_V < 2.5$, while the surface density
of stars is currently
$\Sigma = 1800 M_\odot/1.7\,\mathrm{pc}^2 = 0.2$ g cm$^{-2}$, corresponding
to a column density $N_H \sim 10^{23}$ cm$^{-2}$ and $A_V \approx 70$.

As noted above, the star formation efficiency of embedded clusters in this
mass range is of order 30% (Lada & Lada 2003). This suggests that the
protocluster which formed the ONC had a mass $\sim 3$ times larger than the
current stellar mass. The original dynamical time of the protocluster was
$R/v \approx 10^5\,(R/0.7\,\mathrm{pc})(4\,\mathrm{km\,s^{-1}}/v)$ yr, or
less, as the original radius was likely smaller than the current radius.

If the gas was removed when it was ionized by the central O stars, its
outflow velocity might have been of order 10 km s$^{-1}$ (the sound speed
$c_s \approx 13$ km s$^{-1}$), but the pressure of the ionized gas has to
overcome the weight of the bulk of the cluster gas. The pressure of the gas
was $\pi G \Sigma^2 \approx 10^{-8}$ dynes cm$^{-2}$, while the pressure of
the ionized gas is

$$P_{HII} = \left(\frac{3Q}{4\pi r^3 \alpha_{rec}}\right)^{1/2} kT \approx 10^{-9}\ \mathrm{dynes\,cm^{-2}}, \qquad (6)$$

scaling as $Q^{1/2} r^{-3/2}$, where $Q$ is the number of ionizing photons
emerging from the O stars per second, and $\alpha_{rec}$ is the recombination
coefficient; we assume photoionization equilibrium. It would appear that
initially the ionized gas cannot disrupt the protocluster gas; protostellar
jets are a promising candidate for dissipation of cluster gas, but the rate
of momentum deposition is such that the removal time is rather long. If so,
it is likely that the gas removal was, very roughly speaking, adiabatic.

The original radius of the cluster would then have been
$r \approx 0.24$ pc, while the original mass was
$M \approx 3\times2\times900 M_\odot = 5400 M_\odot$. The mean density was
$\bar\rho \approx 6\times10^{-18}$ g cm$^{-3}$. The Jeans length is
$\lambda_J = 8\times10^{16}$ cm, and the Jeans mass is
$M_J \approx 4\times10^{32}$ g, or 0.2 solar mass, similar to the value of
$m_1 = 0.6 M_\odot$ found by Muench et al. (2002).
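These proto-ONC numbers can be reproduced approximately with standard Jeans formulae. The sketch below is not the paper's own normalisation: it assumes $\lambda_J = (\pi c_s^2/G\bar\rho)^{1/2}$, $M_J = \phi_J\,\bar\rho\,(\lambda_J/2)^3$ with $\phi_J = 1$, a gas temperature of 20 K and a mean molecular weight $\mu = 2$. These choices are ours, adopted because they recover the quoted values.

```python
import math

# Rounded CGS constants.
G = 6.674e-8        # cm^3 g^-1 s^-2
K_B = 1.381e-16     # erg K^-1
M_P = 1.673e-24     # g
M_SUN = 1.989e33    # g
PC = 3.086e18       # cm

def mean_density(mass_msun, radius_pc):
    """Mean density of a uniform sphere, 3 M / (4 pi r^3), in g cm^-3."""
    return 3.0 * mass_msun * M_SUN / (4.0 * math.pi * (radius_pc * PC) ** 3)

def jeans_length(temperature_k, rho, mu=2.0):
    """Assumed form: lambda_J = sqrt(pi * c_s**2 / (G * rho)), c_s**2 = k T / (mu m_p)."""
    cs2 = K_B * temperature_k / (mu * M_P)
    return math.sqrt(math.pi * cs2 / (G * rho))

def jeans_mass(temperature_k, rho, mu=2.0, phi_j=1.0):
    """Assumed normalisation: M_J = phi_J * rho * (lambda_J / 2)**3, in g."""
    return phi_j * rho * (jeans_length(temperature_k, rho, mu) / 2.0) ** 3

rho_onc = mean_density(5400.0, 0.24)
lam = jeans_length(20.0, rho_onc)
m_j = jeans_mass(20.0, rho_onc)
print(f"mean density : {rho_onc:.1e} g cm^-3")
print(f"Jeans length : {lam:.1e} cm")
print(f"Jeans mass   : {m_j:.1e} g = {m_j / M_SUN:.2f} Msun")
```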

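The surface density of the proto-ONC gas was

$$\Sigma = \frac{M}{4\pi r^2} \approx 1.6 \left(\frac{M}{5400 M_\odot}\right) \left(\frac{0.24\,\mathrm{pc}}{r}\right)^{2}\ \mathrm{g\,cm^{-2}} \qquad (7)$$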

as seen from the center of the cluster. This is sufficiently high that we
must consider the possibility that the clump was optically thick to far
infrared radiation. If we assume $T = 20$ K is correct, using the Rosseland
mean opacity



(8) 



(Semenov et al. 2003) we find $\tau = 0.2$. Hence the proto-ONC cloud was
optically thin, but only just.

If, on the other hand, we assume the cluster was optically thick, and
calculate the accretion luminosity (as is done in the next subsection) we
obtain $T_{eff} = 14$ K, and we still find an optical depth less than unity;
we conclude that the ONC was optically thin.
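The surface density of equation (7) and the two optical depth estimates can be tied together numerically. The sketch takes the quoted $\tau = 0.2$ at 20 K as given, infers the opacity that value implies, and rescales to 14 K using $\kappa \propto T^2$ (the low temperature scaling of the Rosseland mean quoted in the next subsection). No opacity table is used; the function names are ours.

```python
import math

M_SUN = 1.989e33    # g
PC = 3.086e18       # cm

def surface_density(mass_msun, radius_pc):
    """Sigma = M / (4 pi r^2) in g cm^-2, as seen from the cluster center."""
    return mass_msun * M_SUN / (4.0 * math.pi * (radius_pc * PC) ** 2)

def rescaled_optical_depth(tau_ref, t_ref_k, temperature_k):
    """Rescale an optical depth at fixed Sigma, assuming kappa ~ T**2 (T < 100 K)."""
    return tau_ref * (temperature_k / t_ref_k) ** 2

sigma = surface_density(5400.0, 0.24)
kappa_20 = 0.2 / sigma                      # opacity implied by tau = 0.2 at 20 K
tau_14 = rescaled_optical_depth(0.2, 20.0, 14.0)

print(f"Sigma               = {sigma:.2f} g cm^-2")
print(f"implied kappa(20 K) = {kappa_20:.2f} cm^2 g^-1")
print(f"tau at T_eff = 14 K = {tau_14:.2f}")
```

Either temperature gives an optical depth below unity, consistent with the conclusion that the ONC was optically thin.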

2.2. Optically thick protoclusters lacking radiation pressure support

The situation will change if the protocluster is optically thick to far
infrared radiation, and if the accretion luminosity is sufficiently high. The
temperature of the gas will then be higher than in optically thin clusters,
potentially leading to a larger Jeans mass in optically thick clusters. For
gas of metallicity $Z$ the optical depth is

$$\tau = \kappa(T,Z)\,\Sigma_{cl} \approx 1 \qquad (9)$$

where $\kappa(T,Z)$ is the Rosseland mean opacity, and we have used the fact
that $\kappa(T) \sim (T/100\,\mathrm{K})^2$ for $T < 100$ K; for
$100\,\mathrm{K} < T < 500\,\mathrm{K}$, $\kappa$ is roughly constant.

We have scaled to the properties of a "metal rich" (one tenth solar) Milky
Way globular cluster, using a typical present day (stellar) mass but a small
half light radius. The reason for the latter choice is that the lockup
fraction $\alpha(t) \approx 0.45$; if this mass was lost from the cluster,
the radius of the cluster has expanded by a factor of 2.2 since it formed. We
noted above that observations of the intercluster medium in globular
clusters, e.g., Freire et al. (2001), show that the gas expelled from
evolving stars is rapidly removed from the cluster.

If the star formation was less than 100% efficient, the amount of expansion
would have been larger.
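The expansion factors follow from the adiabatic mass loss relation used earlier, in which the radius grows in inverse proportion to the retained mass fraction (the $\sim 1/\mathrm{SFE}$ scaling attributed to Hills 1980). A minimal sketch, assuming both gas expulsion and stellar mass loss are slow enough to be treated this way; the 30% efficiency in the second case is only an example value.

```python
def adiabatic_expansion_factor(retained_mass_fraction):
    """r_final / r_initial for slow mass loss, assuming r * M stays constant."""
    if not 0.0 < retained_mass_fraction <= 1.0:
        raise ValueError("retained mass fraction must lie in (0, 1]")
    return 1.0 / retained_mass_fraction

LOCKUP = 0.45   # lockup fraction alpha(12 Gyr) quoted in the text
SFE = 0.30      # example star formation efficiency

print(f"stellar mass loss only : x{adiabatic_expansion_factor(LOCKUP):.1f}")
print(f"with SFE = 30% as well : x{adiabatic_expansion_factor(LOCKUP * SFE):.1f}")
```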

The heat source that, for massive or compact enough clusters, drives the gas
above $T \approx 10$ K is the contraction of the clump due to its own
self-gravity. Consider a clump of mass $M_{cl}$ and radius $r_{cl}$,
contracting at roughly its free fall time $\tau_{ff}$. The cluster generates
a luminosity

$$L = \phi_{ff}\,\frac{G M_{cl}^2}{r_{cl}\,\tau_{ff}} = \phi_{ff}\,\frac{v^5}{G}, \qquad (10)$$

where $\phi_{ff}$ is a dimensionless constant of order unity, and

$$v^2 = \frac{G M_{cl}}{r_{cl}}. \qquad (11)$$

The luminosity of a $10^6 M_\odot$ cluster with $r_{cl} = 1$ pc contracting
at the free fall rate is

$$L = 1.9\times10^{41}\,\phi_{ff} \left(\frac{M_{cl}}{10^6 M_\odot}\right)^{5/2} \left(\frac{r_{cl}}{1\,\mathrm{pc}}\right)^{-5/2}\ \mathrm{erg\,s^{-1}}. \qquad (12)$$

We stress once again that this luminosity is not related to any star
formation.

The effective temperature of the collapsing gas is

$$T_{eff} = \left(\frac{L}{4\pi r_{cl}^2\,\sigma_{SB}}\right)^{1/4}. \qquad (13)$$

For $M_{cl} = 10^5 M_\odot$ this is comparable to the temperature seen in
star forming cores in the Milky Way, so we would not expect any change in the
IMF.
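The accretion luminosity and the resulting effective temperature are easy to evaluate. The sketch uses $L = \phi_{ff} v^5/G$ with $v^2 = GM_{cl}/r_{cl}$ and the usual black body definition of effective temperature, $T_{eff} = (L/4\pi r_{cl}^2\sigma_{SB})^{1/4}$; taking $\phi_{ff} = 1$ and $r_{cl} = 1$ pc for both masses is our assumption.

```python
import math

# Rounded CGS constants.
G = 6.674e-8         # cm^3 g^-1 s^-2
SIGMA_SB = 5.670e-5  # erg cm^-2 s^-1 K^-4
M_SUN = 1.989e33     # g
PC = 3.086e18        # cm

def accretion_luminosity(mass_msun, radius_pc, phi_ff=1.0):
    """L = phi_ff * v**5 / G with v**2 = G M / r, in erg s^-1."""
    v = math.sqrt(G * mass_msun * M_SUN / (radius_pc * PC))
    return phi_ff * v**5 / G

def effective_temperature(luminosity, radius_pc):
    """Black body effective temperature of a sphere of radius r, in K."""
    area = 4.0 * math.pi * (radius_pc * PC) ** 2
    return (luminosity / (area * SIGMA_SB)) ** 0.25

for mass in (1.0e5, 1.0e6):
    lum = accretion_luminosity(mass, 1.0)
    t_eff = effective_temperature(lum, 1.0)
    print(f"M_cl = {mass:.0e} Msun: L = {lum:.1e} erg/s, T_eff = {t_eff:.0f} K")
```

At fixed radius the effective temperature scales as $M_{cl}^{5/8}$, so a tenfold increase in mass raises it by a factor of about four.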

However, for clusters larger than $\sim 10^6 M_\odot$, $T \approx 70$ K, and
the interior temperature of the protocluster is higher still. To find the run
of $T$ with $r$, we solve the radiative transfer equation in the diffusive
limit, employing the Rosseland mean opacity as calculated by
Semenov et al. (2003). We start at the surface of the clump, with a known
luminosity, radius, and effective temperature. Since the clump is nearly in
free-fall, we assume the density scales as $\rho(r) \sim 1/r^2$. With $\rho$
and $T$ known, we can find the opacity. Integrating inwards, we solve for the
temperature; then using our prescribed density and the new $T$, we find the
opacity, and proceed inwards.

Once we know $T(r)$, $\kappa(r)$, and (our assumed) $\rho(r)$, we calculate a
mass weighted temperature. For the parameters used here, we find
$T_{mass} \approx 165$ K, but this varies with the mass and metallicity of
the cluster.



The corresponding Jeans length is

$$\lambda_{Jeans} \approx 1.4\times10^{17}\ \mathrm{cm}, \qquad (14)$$

while the Jeans mass is

$$M_{Jeans} \approx 2.7\,\phi_J\,M_\odot. \qquad (15)$$



Figure 2 plots $M_J$ as a function of $M_{cl}$ for three different values of
the metallicity. The cluster radius is set to 1.0 pc for
$2000 M_\odot < M_{cl} < M^*_{cl}$, where $M^*_{cl} \approx 10^6 M_\odot$ is
the mass at which $r_{cl}$ begins to rise as a result of the increased
accretion luminosity, as discussed below in §3. For smaller masses $r_{cl}$
is chosen to match the present day mean cluster radius in the Milky Way,
allowing for cluster expansion as the lockup fraction $\alpha(t)$ decreases
due to stellar evolution.

The Figure shows that $M_J$ first decreases with increasing $M_{cl}$. At
small cluster masses, where the cluster is optically thin to FIR radiation,
the temperature is independent of $M_{cl}$, while the density increases, so
$M_J$ decreases with increasing $M_{cl}$. This is a result of our assumption
that the initial cluster radius does not vary with cluster mass. This
assumption can be checked by measuring the IMF in globular clusters, and
seeing if massive clusters show signs of a sharp decrease in the number of
stars, binned by mass, as the stellar mass increases. This signature of the
IMF may, however, be erased by the combination of mass segregation and
preferential evaporation of low mass stars, e.g.,
Baumgardt & Makino (2003).

FIG. 2. The Jeans mass plotted against stellar cluster mass $M_{cl}$;
$M_{cl}$ is the initial gas mass of the proto-cluster, i.e., the plot does
not take account of any gas expelled from the cluster as the stars are
formed. The metallicities are $Z/Z_\odot = 1$ (solid line),
$Z/Z_\odot = 0.1$ (dashed), and $Z/Z_\odot = 0.01$ (dot-dash). The Jeans mass
initially decreases with increasing $M_{cl}$, since $T$ is constant while
$\rho$ increases (because we have assumed $r_{cl}$ is constant for
$2000 < M_{cl} < 10^5 M_\odot$); this decrease is halted when the cluster
becomes optically thick, leading to an increase in $T$ and hence in $M_J$.
The result is a rapid increase in $M_J$ with increasing $M_{cl}$. There is
another change in behavior when the protocluster becomes radiatively
supported, for $M_{cl} > 10^6 M_\odot$ (depending on metallicity). See text
for further details.
The Jeans mass jumps up rather abruptly (at $M_{cl} \sim 10^5$-$10^6 M_\odot$,
depending on metallicity); this jump is followed by
a rapid but smooth increase. The jump reflects the fact that
the cluster has become optically thick; it is artificially abrupt
due to our crude treatment of the radiative transfer when the
optical depth is near unity. The smooth increase reflects the
increase in $T$ driven by the increase in accretion luminosity
with increasing $M_{cl}$, $T_{eff} \sim M_{cl}^{5/8}$. Assuming a constant $r_{cl}$,
we find $M_J \sim M_{cl}^{7/16}$.
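The exponent for the constant-radius case can be checked with exact arithmetic. The sketch below assumes the standard Jeans-mass scaling $M_J \propto T^{3/2}\rho^{-1/2}$ and takes the gas temperature to follow $T_{eff}$; the function name `jeans_exponent` is ours and purely illustrative.

```python
from fractions import Fraction

def jeans_exponent(t_exp, rho_exp):
    """Exponent of M_cl in M_J, assuming M_J ~ T^(3/2) * rho^(-1/2).

    t_exp and rho_exp are the exponents of M_cl in T and rho respectively.
    """
    return Fraction(3, 2) * t_exp - Fraction(1, 2) * rho_exp

# Optically thick, constant-radius regime from the text:
# T_eff ~ M_cl^(5/8), and rho ~ M_cl^1 because r_cl is held fixed.
thick_constant_radius = jeans_exponent(Fraction(5, 8), Fraction(1))
print(thick_constant_radius)  # 7/16

# Optically thin regime: T independent of M_cl, rho ~ M_cl, so M_J falls.
thin_constant_radius = jeans_exponent(Fraction(0), Fraction(1))
print(thin_constant_radius)  # -1/2
```

The negative exponent in the second case corresponds to the initial decline of $M_J$ with cluster mass described above.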

There is a slight break in the slope at $M_{cl} \approx 10^6 M_\odot$ (for solar
metallicity) related to the increase in $r_{cl}$ mentioned above
and driven by radiation pressure support; at higher $M_{cl}$ the
temperature actually decreases, although very slowly,
$T_{eff} \sim M_{cl}^{-1/20}$. However, $M_J$ continues to increase rather rapidly,
because the mean density actually decreases, $\rho \sim M_{cl}^{-4/5}$. We
find $M_J \sim M_{cl}^{7/20}$.

We note that the Jeans mass increases rather strongly with
increasing cluster mass; this will result in a mass to light ratio
$\Upsilon_V$ that, for old clusters, will also increase with increasing
$M_{cl}$. We will return to this point below.

2.3. Mass to light ratios of clusters with large Jeans masses 

The luminosity to mass ratio of the modified IMF will differ 
significantly from that of the standard Muench et al. IMF; a 



FIG. 3. — The luminosity to mass ratio for a modified Muench et al. IMF,
plotted as a function of $m_1$, interpreted as a Jeans mass, i.e. $m_1 = m_J$. The
cluster age is 2.5 Myrs. The horizontal dotted line gives the Eddington ratio
$4\pi G c/\kappa_{es} = 6.6 \times 10^4\,{\rm cm^2\,s^{-3}}$.

FIG. 4. — The luminosity to mass ratio for the clusters shown in
Figure 2 at an age of 2.5 Myr. The jump in $L/M$ occurs at the mass where
the cluster becomes optically thick to the far infrared radiation.

larger fraction of the resulting young stellar cluster is in massive
stars, leading to a higher luminosity to mass ratio. The
luminosity to mass ratio as a function of the Jeans mass is
shown in Fig. 3 for a cluster with an age of 2.5 Myrs. The
horizontal dashed line is the Eddington luminosity to mass ratio,
$4\pi G c/\kappa_{es}$, where $\kappa_{es} \approx 0.38\,{\rm cm^2\,g^{-1}}$ is the electron scattering
opacity.
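The Eddington ratio quoted in the caption of Fig. 3 follows directly from the opacity given above. A minimal numerical check, assuming standard cgs values for $G$ and $c$ (the constant and function names are ours):

```python
import math

G_CGS = 6.674e-8     # gravitational constant, cm^3 g^-1 s^-2
C_CGS = 2.998e10     # speed of light, cm s^-1
KAPPA_ES = 0.38      # electron scattering opacity from the text, cm^2 g^-1

def eddington_light_to_mass(kappa):
    """Return 4*pi*G*c/kappa in erg s^-1 g^-1, i.e. cm^2 s^-3."""
    return 4.0 * math.pi * G_CGS * C_CGS / kappa

l_over_m_edd = eddington_light_to_mass(KAPPA_ES)
print(f"{l_over_m_edd:.2e} cm^2 s^-3")  # close to 6.6e4
```

Because the ratio scales as $1/\kappa$, the same function can be called with a dust opacity in place of `KAPPA_ES` to see how the limiting light to mass ratio shifts.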

Figure 4 shows the luminosity to mass ratio for the same
cluster models shown in Fig. 2. Recall that these models had
$r_{cl} = 1$ pc for $10^4 M_\odot < M_{cl} < 10^6 M_\odot$. It is not clear that this is
the correct mass-radius relation to use. However, the general
trend of a slowly varying $L/M$ ratio for small mass clusters,
with an increase for masses above some value ($\sim 10^5 M_\odot$ in
the Figure) should be correct for the actual mass-radius relation
for massive protoclusters.

While these optically thick clusters have high L/M ratios

FIG. 5. — The mass to light ratio $\Upsilon_V = M/L_V$, in solar units (solar
mass over solar V-band luminosity) for an optically thin (solid line,
$m_J = 0.5 M_\odot$) and an optically thick (dashed line, $m_J = 10.0 M_\odot$) cluster as
a function of cluster age, as calculated by Starburst99 (Leitherer et al. 1999;
Vazquez & Leitherer 2005). The metallicities ($Z = 0.2 Z_\odot$) and initial stellar
masses of the two clusters are the same. When the clusters are young, the
optically thick cluster is more luminous than the optically thin cluster, but after
a few gigayears the optically thin cluster is more luminous than the optically
thick cluster.

compared to optically thin clusters when they are young, they
have low L/M ratios, or high M/L ratios, when they are more
than a few Gyrs old, as illustrated by the lines in Figs. 5 and
6. The reason is clear; when young, the cluster luminosity is
dominated by massive stars, and optically thick clusters have
more massive stars per unit mass (and hence fewer low mass
stars per unit mass) than optically thin clusters. However, after
all the massive stars have evolved, the cluster light is supplied
by low mass stars, and the optically thin clusters have
more $0.6$-$0.8 M_\odot$ stars per total mass than do the optically
thick clusters; the optically thick clusters have more stellar
remnants (primarily white dwarfs, with some neutron stars
and possibly black holes) than the optically thin clusters.

3. RADIATION PRESSURE SUPPORTED PROTOCLUSTERS 

Another dramatic change occurs when the protocluster is
massive enough that the accretion luminosity is dynamically
important. The protocluster has an FIR optical depth larger
than one, so the outward force exerted by the radiation is

$$F_{rad} = \frac{\tau L}{c}. \qquad (16)$$

If this force is less than that of gravity, the protocluster will
continue to shrink, but as $r$ decreases the radiation force
increases as $r^{-9/2}$, while the force of gravity increases as $r^{-2}$.
For small enough $r$ the radiation pressure overcomes the force
of gravity and the collapse is slowed.

The optical depth $\tau$ is measured from the center of the
protocluster outward. Equation (16) assumes that the optical
depth is independent of direction, which is not strictly
speaking true given the turbulent nature of the cluster gas.
The optical depth is proportional to the column density
of gas. The column density distribution has been measured
in the Milky Way by a number of authors, e.g.,
Goodman, Pineda, & Schnee (2008) or Wong et al. (2008),

FIG. 6. — The mass to light ratio $\Upsilon_V = M/L_V$ as a function of dynamical
mass $M_{cl}$. The squares are Milky Way globular clusters (Harris 1996;
velocities from Pryor & Meylan (1993)). Triangles are M31 globular clusters
(for which velocity dispersions are available) from Barmby et al. (2007).
The filled pentagons are globular clusters near NGC 5128 (Cen A) from
Rejkuba et al. (2007). The filled hexagons are UCDs from Hasegan et al.
(2005) (magenta), Virgo UCDs from Evstigneeva et al. (2007) (red), and Fornax
UCDs (Hilker et al. 2007) (cyan). The open blue circles are the nuclei of
Virgo dwarf elliptical galaxies from Geha et al. (2002). The large filled circles
are the mean $\Upsilon_V$ in three mass bins $M_{cl} < 10^6 M_\odot$, $10^6 M_\odot < M_{cl} <
10^7 M_\odot$, and $10^7 M_\odot < M_{cl}$. The mass to light ratios of all clusters have been
adjusted for the effects of two body relaxation and tides, as described in the
text. The solid lines give the predictions of equations (2), (13), and (17) for
a solar metallicity cluster (top curve), for a metallicity of 0.2 solar (middle
curve), and a metallicity of 0.02 solar, all at an age of 10 Gyrs.

and has been found to be consistent with a log-normal distribution.
Numerical simulations also find log-normal surface
density distributions (Ostriker, Stone, & Gammie 2001).
The observations measure $\tau$ along the line of sight from
the Earth through the cloud rather than from the center of
the cloud outward, but the two surface density distributions
should not differ dramatically. In the notation of appendix
A, Goodman, Pineda, & Schnee (2008) find $0.11 < \sigma < 0.22$,
corresponding to $0.01 < \mu < 0.05$. This agrees well with the
results of Ostriker, Stone, & Gammie (2001).

For $\mu = 0.05$, 99% of sight lines have $\tau/\bar{\tau} > 0.2$, where $\bar{\tau}$
is the (angular) mean of the optical depth. Since $\bar{\tau} \approx 50$ for
$M_{cl} = 10^6 M_\odot$ and $r_{cl} = 1$ pc, there are essentially no optically
thin sight lines for such massive clusters. In fact, the radiation
pressure does not become dynamically important until
the cluster is smaller than 1 pc and $\bar{\tau}$ is larger, but this only
decreases the chance that photons leak out in directions with
small optical depth.

When the radiation force approaches that of gravity, the col- 
lapse will slow from the free fall rate to the Kelvin rate. In 
appendix [C] we show that this occurs at a radius 

rad = (^f 5 Mf « 1.5 x 10" (^) 3/5 

(j^-r) V \¥) cm. 

(17) 

It can be shown that this is also the radius at which the photon 
diffusion time out of the clump is equal to the clump dynami- 
cal time. 

These radiation supported clusters have a monotonically
increasing $r_{cl}(M_{cl})$ relation, unlike the less massive globular
clusters found in the Milky Way and other nearby galaxies.
This will change a number of their properties. For example,
their surface densities will decrease with increasing mass,

$$\Sigma(M_{cl}) \sim M_{cl}^{-1/5}, \qquad (18)$$

where we use $\Sigma = M/(4\pi r^2)$. This scaling with cluster mass
is in contrast to globular clusters, which have surface densities
that increase with $M_{cl}$, $\Sigma(M_{cl}) \sim M_{cl}$. The volume density will
decrease even more rapidly

$$\rho(M_{cl}) \sim M_{cl}^{-4/5}. \qquad (19)$$

This will have consequences for the evolution of binary systems,
in particular the precursors to LMXBs and millisecond
pulsars. Since the formation of such binaries depends strongly
on the stellar number density, the number of LMXBs and
millisecond pulsars per stellar mass will peak at a cluster mass
$\sim 3 \times 10^6 M_\odot$.

The escape velocity will also behave differently for the
more massive clusters,

$$v(M_{cl}) \sim M_{cl}^{1/5}, \qquad (20)$$

compared to the much more rapid $v(M_{cl}) \sim M_{cl}^{1/2}$ for globular
clusters.
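The three scalings in this passage all follow from the mass-radius relation alone. As a quick consistency check, the sketch below assumes $r_{cl} \propto M_{cl}^{3/5}$ for radiation supported clusters and a mass-independent radius for ordinary globulars; `structural_exponents` is a hypothetical helper name.

```python
from fractions import Fraction

def structural_exponents(r_exp):
    """Exponents of M_cl in (Sigma, rho, v_esc) given r_cl ~ M_cl^r_exp.

    Uses Sigma ~ M / r^2, rho ~ M / r^3 and v_esc ~ (M / r)^(1/2).
    """
    sigma_exp = 1 - 2 * r_exp
    rho_exp = 1 - 3 * r_exp
    v_exp = (1 - r_exp) / 2
    return sigma_exp, rho_exp, v_exp

# Radiation supported clusters: r_cl ~ M_cl^(3/5).
print(structural_exponents(Fraction(3, 5)))
# -> (Fraction(-1, 5), Fraction(-4, 5), Fraction(1, 5))

# Globular clusters with a mass-independent radius.
print(structural_exponents(Fraction(0)))
# -> (Fraction(1, 1), Fraction(1, 1), Fraction(1, 2))
```

The sign change in the surface and volume density exponents between the two cases is the contrast the text draws between massive clusters and globulars.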

It follows that the accretion luminosity $L \sim M_{cl}$ as expected
(assuming $\kappa$ is constant) since the cluster is limited by the
Eddington luminosity. Furthermore, the effective temperature
will actually decrease with increasing cluster mass, although
very slowly, $T_{eff} \sim M_{cl}^{-1/20}$.

The stellar luminosity is another question. When the stars 
ignite, they will increase the luminosity of the cluster. This 
will tend to increase the radiation pressure, and in turn the 
radius of the cluster. However, as soon as r begins to increase, 
the accretion luminosity will drop so as to maintain the total 
luminosity near the Eddington value. If the stellar luminosity 
exceeds the cluster Eddington value, the stars may expel any 
remnant gas on a dynamical time. 

However, the maximum stellar luminosity depends on the
IMF, and is generally well below the accretion luminosity, as
we now show. The accretion luminosity to mass ratio is

$$\frac{L}{M_{cl}} = \frac{4\pi G c}{\kappa_{es}}\,\frac{\kappa_{es}}{\kappa_d}. \qquad (21)$$

The first factor on the right hand side of this equation is the
Eddington light to mass ratio, $(L/M)_{Edd} = 6.6 \times 10^4\,{\rm cm^2\,s^{-3}}$,
while the opacity ratio ranges from 0.1 for solar metallicity
dust (and $T \approx 100$ K) to 10 for a low metallicity cluster. From
Figure 3 or 2 the maximum stellar $L_*/M_* \approx 2 \times 10^4\,{\rm cm^2\,s^{-3}}$,
i.e., about 1/3 of the Eddington $L/M$ ratio. Only for the most
massive and metal rich clusters will the stellar luminosity
approach (or slightly exceed) the accretion luminosity. However,
the largest clusters actually have lower temperatures, as
we have just seen, so $\kappa_d$ is likely to be a bit lower than the
rough estimate $\kappa_d = 3$ we used here. We conclude that the stellar
luminosity is lower than the accretion luminosity, at least
until most of the gas has turned into stars.

This assumes that no other mechanism can eject gas from
the cluster. One mechanism that might do so is a combination
of protostellar jets and stellar winds. Recall that the Jeans
mass is of order $5$-$10 M_\odot$ in these clusters, so a substantial
fraction of stars will have massive winds. Both the protostellar
winds and jets will have mechanical luminosities less than
a tenth of the stellar Eddington luminosity. These winds and
jets will shock, producing gas at $T = 10^9$ K, but this gas will
rapidly cool by conduction to $T \approx 10^7$ K. From that temperature
it will cool radiatively in a time of order $10^4$ years, less
than a dynamical time, as we showed above in §2. We tentatively
conclude that jets and winds will not expel the bulk of
the cluster gas.

Given that observed clusters are only a factor of a few times
larger than the minimum radius allowed by radiation pressure,
it would appear that the fraction of gas ejected may be of order
1/2, but not substantially larger. We leave this question for
later work.

As with the optically thick accretion heated clusters discussed
in §2.2, the IMF of these clusters will be top heavy,
with Jeans masses ranging up to $10 M_\odot$, as shown in Fig. 2.
When they are young, they will have elevated light to mass
ratios; when older, their mass to light ratios will be higher than
ordinary optically thin globular clusters of the same age and
metallicity; see Figure 6 and the next subsection.

The very high $m_J$ predicted by these models suggests that
the lockup fraction $\alpha(t)$ will be particularly small for these
very massive clusters, perhaps as small as 0.3 or less (Fig. 
Q]). We have noted above that the gas expelled from evolving 
stars is rapidly removed from present day globular clusters
(Freire et al. 2001). The escape velocity from the more massive
clusters considered here is only slightly higher than in
globulars, so the same gas removal mechanism, whatever it
is, should be effective. The radius of a cluster of age $t$ will
then depend on the present day stellar mass as

2/5 

r c/ (r,M d ) = a(r)- 8/5 0™dG 1/5 (£^) M e 3 / 5 . (22) 

This relation is shown as the dashed line in Figure 8 for $\alpha(t) \approx
0.38$, appropriate for a 10 Gyr old cluster with a Jeans mass
of $2 M_\odot$.

If these massive clusters have larger Jeans masses than optically
thin globular clusters, they will experience a larger relative
expansion in $r_{cl}$ than the globular clusters. Evidence for
this expansion may be detectable in old massive clusters as
a surrounding halo of stars extending out to the tidal radius;
both the larger relative expansion and the larger number of
stars in the massive clusters will render this halo more visible
than similar halos around globulars.

The small lockup fraction will also alter the present day
values of derived quantities such as surface density, relative
to the initial values given above. For example, the surface
densities will be smaller by $\alpha^3$, as illustrated by the dashed line
in Figure 9.

3.1. The relation between luminosity and velocity dispersion 
in massive old clusters 

We noted above that the mass to light ratio $\Upsilon_V$ will be an
increasing function of cluster mass, even at fixed metallicity.
This will lead to a relation between luminosity and velocity
dispersion that differs from that expected from a simple
application of the virial theorem, $L \sim \sigma^5$.

We start with the observation that the luminosity of a cluster
depends on the IMF, or, in our case, on the Jeans mass.
Consider an old cluster, in which the turnoff mass (the main
sequence lifetime of a star at the turnoff mass is equal to the
age of the cluster) is smaller than the Jeans mass. The mass
to light ratio of such a cluster is an increasing function of the
Jeans mass; the cluster light is coming from stars on the flat
part of our adopted IMF, and the number of such stars (at a
fixed initial cluster mass) decreases with increasing $M_J$. Another
way to say this is that a larger $M_J$ leads to more mass
in massive stars, which contribute no light at late times, leaving
less mass in (and fewer numbers of) low mass stars to illuminate
the cluster's declining years. Hence $\Upsilon_V = M_i/L_V$,
where the subscript $i$ denotes the initial cluster mass, increases
with increasing $M_J$. The mass to light ratio for our IMF scales
as $\Upsilon_V(M_J) \sim M_J^{0.73}$, as long as the turnoff mass is below $M_J$.

The observationally accessible quantity is $\Upsilon_V(M_{cl})$; we
measure the present day cluster mass, not the initial cluster
mass. For clusters with $T_{eff} > 10$ K (optically thick clusters
with sufficiently high accretion rates to heat the gas, but
not to provide radiation support) we can estimate the scaling
$\Upsilon_V(M_{cl,i}) \sim M_{cl,i}^{0.32}$. The exponent is 0.73 times 7/16, the latter
arising from the scaling relation between $M_J$ and the initial
$M_{cl}$, $M_J(M_{cl,i}) \sim M_{cl,i}^{7/16}$, found in §2.2. The present day $M_{cl}$ is
smaller than $M_{cl,i}$ by (at least) a factor $\alpha$. Given $M_{cl,i}$, we can
calculate $M_J$ and hence $\alpha$, whence we can find the present day
$M_{cl}$. Doing so, we find

$$\Upsilon_V(M_{cl}) \sim M_{cl}^{0.28}. \qquad (23)$$

The situation for radiation supported clusters is similar, but 
the Jeans mass varies more rapidly with the initial cluster 

mass, My ~ M^^. This occurs despite that fact that the ef- 
fective temperature of the cluster actually decreases with in- 
creasing M c i\ the decreasing temperature is more than offset 
by the rapidly increasing cluster radius (or decreasing den- 
sity). The result is Ty(M e/ ) ~ M°jf, and 

Ty(M d )~MS n . (24) 

Using the scaling of mass to light ratio $\Upsilon_V$ with cluster mass
$M_{cl}$, we can find the relation between the stellar luminosity
and the cluster velocity dispersion $\sigma = v/\sqrt{\Gamma_{vir}}$, where the
constant $\Gamma_{vir}$ is defined by $M_{vir} = \Gamma_{vir}\sigma^2 r_{cl}/G$. We start with

''= s » lf '»(s)fe)"* < 25 > 

where $\Upsilon_{V,6}$ is the mass to light ratio for a cluster of $10^6 M_\odot$.
Combining this with equation (20) we find

(26) 

for the present day luminosity of a radiation pressure supported
cluster formed 10 Gyrs ago. For optically thick clusters
that were not radiation pressure supported, $L_V \sim \sigma^{3.6}$.
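One way to see how an exponent such as 3.6 can arise is to combine a power-law mass to light ratio with a virial relation. The sketch below is our reading rather than the paper's stated derivation: it assumes $\Upsilon_V \sim M_{cl}^{\gamma}$, $r_{cl} \sim M_{cl}^{3/5}$ and $\sigma^2 \sim M_{cl}/r_{cl}$, so that $M_{cl} \sim \sigma^5$. The function name is hypothetical.

```python
def luminosity_sigma_exponent(gamma, r_exp=0.6):
    """Return beta in L_V ~ sigma^beta.

    Assumes Upsilon_V ~ M^gamma, r ~ M^r_exp and sigma^2 ~ M / r,
    so that M ~ sigma^(2 / (1 - r_exp)) and L_V = M / Upsilon_V ~ M^(1 - gamma).
    """
    mass_exponent_in_sigma = 2.0 / (1.0 - r_exp)
    return mass_exponent_in_sigma * (1.0 - gamma)

# Constant mass to light ratio recovers the simple virial scaling.
print(round(luminosity_sigma_exponent(0.0), 2))   # 5.0

# Upsilon_V ~ M^0.28, as in equation (23).
print(round(luminosity_sigma_exponent(0.28), 2))  # 3.6
```

Under these assumptions a steeper dependence of $\Upsilon_V$ on mass always flattens the luminosity and velocity dispersion relation below $\sigma^5$.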

4. COMPARISON WITH OBSERVED CLUSTER PROPERTIES 

We start by considering the light to mass ratio L/M for 
young clusters, then the mass to light ratio M/L for old clus- 
ters 3 . Either ratio is affected by a number of parameters, in- 
cluding age, IMF, and metallicity. A cluster's L/M depends 

3 We reverse the ratio with advancing age since this is the practice in the 
literature 



most strongly on age, moderately weakly on the IMF (for the 
type of model considered in this paper), and very weakly on 
metallicity. 

Figure 5 shows that $M/L$ varies by a factor of several hundred
over 10 Gyrs for a given $M_{cl}$ and metallicity. Figure 6
shows that over the observed range of cluster masses, variations
in the Jeans mass should produce a variation in the mass
to light ratio (at 10 Gyrs) from $\sim 2 M_\odot/L_{V,\odot}$ at $M_{cl} = 10^5 M_\odot$
to $\sim 6$ at $M_{cl} = 10^8 M_\odot$, an increase of more than a factor of 3.

The metallicity of a cluster affects the $\Upsilon_V$ of a cluster
with a fixed IMF, since main sequence metal poor stars of a
given mass tend to be more luminous than metal rich stars of
the same mass and age. However, the effect is not strong:
Figure 6 shows that a factor of five change in metallicity
changes $M/L_V$ (at an age of 10 Gyrs) by about 42% around
$M_{cl} = 10^6 M_\odot$, but by only 20% near $M_{cl} = 10^8 M_\odot$. Since
the metallicity can be obtained with moderate precision from
spectroscopic studies of a cluster, this direct source of variation
in $M/L$ is reasonably easy to account for.

The metallicity can also indirectly affect our estimate of
$M/L$. For example, Jordán (2004) has pointed out that high
metallicity clusters can appear to have smaller half light radii
than low metallicity clusters of the same age, mass, and IMF;
the mass of turnoff stars (which dominate the light of the cluster)
is smaller in the metal poor cluster than that in the metal
rich cluster. The variation in radius is $\sim 20$% for a factor
of ten variation in metallicity, and the change in the apparent
mass is of the same order.

We conclude that clusters with $M_{cl} \approx 10^8 M_\odot$ should have
$M/L_V \approx 6$ or higher, more than a factor of two higher than
globular clusters, and that metallicity variations cannot produce
such a large change in $M/L_V$. Armed with these results,
we proceed to examine various clusters for evidence of a
top-heavy IMF.

4.1. The light to mass ratio of young clusters 

In M82 the luminosities, ages, and projected half light
radii of a number of star clusters have been measured
(Smith & Gallagher 2001; McCrady et al. 2003; Smith et al.
2006); the first two papers also measured line of sight velocity
dispersions for a total of three clusters. The line of sight
velocity dispersions for 19 clusters have been measured by
McCrady & Graham (2007).

Both Smith & Gallagher (2001) and McCrady et al. (2003)
present evidence that cluster M82-F has a luminosity (in the
F160W filter) to mass ratio a factor of $\sim 2.5$ higher
than expected for a cluster of its estimated age ($\sim 50$ Myrs).
They suggest a lower mass cutoff to the IMF of $2.5 M_\odot$ and
$\sim 1 M_\odot$, respectively. The projected half light radius and mass
of M82-F are $r_{cl} = 89$ mas, about 1.5 pc at 3.6 Mpc, the distance
to M82, and $M_* = 5.5 \times 10^5 M_\odot$. Using these values,
and a metallicity 1.7 times solar, we find that the Jeans mass
in M82-F is $\sim 2.5 M_\odot$.

The status of M82-F is the subject of some debate in the
literature; see Bastian et al. (2007) for an extended discussion
of this object. These authors find a high T, as have previous 
authors. They offer several possible explanations, including 
a top-heavy IMF, but also mass segregation (not seen with 
their data) and inaccurate estimates of the velocity dispersion
resulting from spatially variable extinction. They rule
out the suggestion of Bastian & Goodwin (2006) that M82-F
is younger than 20 Myrs.

Similarly, McCrady et al. (2003) find that their cluster 11,
with $r_{hp} = 1.2$ pc and $M_* = 3.5 \times 10^5 M_\odot$, has $L/M$ about 2.5
times larger than expected, again suggesting a top heavy IMF.
Any mass segregation in this cluster would have to be primordial,
since the cluster is so young, around 10 Myrs. For this
cluster, our models predict a Jeans mass of $2.3 M_\odot$.

In contrast to these results for clusters M82-F and M82-11,
McCrady et al. (2003) and McCrady & Graham (2007) find
that their cluster 9 has $r_{hp} = 2.6$ pc and $M_* = 2.3 \times 10^6 M_\odot$.
This cluster has $L/M$ consistent with a normal IMF, while our
models predict a large Jeans mass and a high $L/M$ ratio.

The other clusters in M82 are optically thin, so we do not 
expect them to have elevated L/M ratios. 

Both Smith & Gallagher (2001) and McCrady et al. (2003)
consider the effects of mass segregation (heavier stars either
forming preferentially or settling into the cluster center).
Smith & Gallagher (2001) argue that this is unlikely to explain
the higher $L/M$, while McCrady et al. (2003) are more
circumspect.

Boily et al. (2005) find that the dynamically evolving mass
segregation will cause the half light radius to decrease, weakening
the argument for an enhanced light to mass ratio in
M82-F, but not in cluster 11 (due to its youth, dynamical mass
segregation is not important in this cluster).

Moving to young super star clusters in galaxies other than
M82, Bastian et al. (2006) list eleven clusters less than 300
Myrs old; of these only NGC 4038:W99-15 is optically thick.
Like M82-9, this cluster has a light to mass ratio consistent
with a normal IMF. Other massive clusters listed in
Bastian et al. (2006), such as NGC 7252:W3, NGC 7252:W30
and NGC 1316:G114, which are optically thick, are between
500 Myr and 3 Gyrs old (Bastian et al. 2006), so they are also
not expected to have elevated $L/M$ ratios (see Fig. 5).

To summarize, we are aware of only four young (less
than 300 Myr old) clusters in the literature that were probably
born optically thick to far-infrared radiation: M82-F, M82-11,
M82-9, and NGC 4038:W99-15. The first two show signs of
having elevated light to mass ratios, while the latter two do
not.

4.2. The mass to light ratio of massive star clusters 

Figure 6 shows observed mass to light ratios for several
classes of star clusters, including globular clusters from the
Milky Way, M31, and Cen A, more massive UCDs from the
Virgo and Fornax clusters, and four nuclei of Virgo dwarf ellipticals.
Where available we have used masses from the literature
obtained by fitting models to individual clusters. Where
such detailed fits are not available, we have calculated the
masses from the observed velocity dispersions $\sigma$ and half light
(or effective) radii $r_{cl}$ using the expression

$$M = \Gamma_{vir}\,\frac{\sigma^2 r_{cl}}{G}. \qquad (27)$$

We use $\Gamma_{vir} = 10$. We use only objects for which the quoted
error for $\sigma$ is less than half the value of $\sigma$.

Binning the clusters by mass ($M_{cl} < 10^6 M_\odot$, $10^6 M_\odot < M_{cl} <
10^7 M_\odot$, and $10^7 M_\odot < M_{cl}$), we find $\Upsilon_V = 1.9 \pm 0.8$, $2.8 \pm 1.2$,
and $4.7 \pm 1.6$ respectively, all in solar units, consistent with
the impression given by the points representing individual objects.
This rapid increase in $\Upsilon$ with increasing cluster mass (or
luminosity) has been noted previously, e.g., by Hasegan et al.
(2005).

The mass to light ratio $\Upsilon$ of a cluster changes with age
due to both stellar evolution and to differential loss of low
mass compared to high mass stars. The latter is a result
of two body relaxation combined with tidal stripping, e.g.,
Baumgardt & Makino (2003). As Fig. 5 shows, the effect of
stellar evolution is to increase $\Upsilon$ with increasing cluster age.
The solid curves in Fig. 6 assume that the clusters are 10 Gyrs
old. Some of the UCDs may well be this young or younger; if
they are younger, their $\Upsilon$ values should be increased slightly
to compare to the globular clusters, since the latter almost certainly
are 11-12 Gyrs old.

Metallicity also affects $\Upsilon$, but the majority of the objects
plotted in the figure, including the UCDs, have [Fe/H] < -1.
It is worth stressing that for such low metallicities both the
Kroupa and Muench et al. IMFs predict $\Upsilon_V < 3$.

4.2.1. Dynamical effects 

Variations in $\Upsilon$ due to relaxation and tides are most relevant
for low mass objects. Relaxation initially reduces $\Upsilon$ as low
mass stars are tidally stripped from the outskirts of the cluster,
then increases $\Upsilon$ just before the cluster is disrupted at time
$T_{dis}$. The disruption time found by the numerical simulations
is well approximated by (Baumgardt & Makino 2003)

$$T_{dis} = \beta \left[\frac{N_*}{\ln(\gamma N_*)}\right]^{x} \frac{R_c}{\rm kpc}\,\frac{220\,{\rm km\,s^{-1}}}{v_c}\,(1-e)\ {\rm Myr}, \qquad (28)$$

where the values of the constants are $\beta \approx 1.9$, $\gamma \approx 0.02$, and
$x \approx 0.75$. The cluster is assumed to orbit at a mean radius $R_c$
from the center of the galaxy on an orbit with eccentricity $e$.
The galaxy has a circular velocity $v_c$ (or a velocity dispersion
which can be converted to an equivalent circular velocity in
the case of elliptical hosts). The initial number of stars $N_* =
M_{cl}/\langle m \rangle$, where $\langle m \rangle$ is the mean stellar mass for the chosen
IMF.

The cluster mass at time $T$ is related to the initial cluster
mass $M_{cl,i}$ by

$$M_{cl}(T) = 0.70\,M_{cl,i}\,(1 - T/T_{dis}). \qquad (29)$$
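The two relations above are straightforward to evaluate numerically. The sketch below is illustrative only: it assumes the disruption time fit has the form $\beta [N_*/\ln(\gamma N_*)]^x (R_c/{\rm kpc})(220\,{\rm km\,s^{-1}}/v_c)(1-e)$ in Myr, with the constants quoted in the text, and the example cluster (mass, mean stellar mass, orbit) is hypothetical.

```python
import math

def disruption_time_myr(n_stars, r_kpc, v_c_kms, ecc,
                        beta=1.9, gamma=0.02, x=0.75):
    """Disruption time in Myr for a cluster of n_stars stars.

    Assumed form: beta * [N / ln(gamma N)]^x * (R/kpc) * (220/v_c) * (1 - e).
    """
    relaxation_term = (n_stars / math.log(gamma * n_stars)) ** x
    return beta * relaxation_term * r_kpc * (220.0 / v_c_kms) * (1.0 - ecc)

def cluster_mass_at_time(m_initial, t_myr, t_dis_myr):
    """Equation (29): M_cl(T) = 0.70 * M_cl,i * (1 - T / T_dis)."""
    return 0.70 * m_initial * (1.0 - t_myr / t_dis_myr)

# Hypothetical example: 1e5 Msun cluster, mean stellar mass 0.5 Msun,
# orbiting at 8.5 kpc in a 220 km/s potential with eccentricity 0.5.
m_init = 1.0e5
n_init = m_init / 0.5
t_dis = disruption_time_myr(n_init, r_kpc=8.5, v_c_kms=220.0, ecc=0.5)
m_now = cluster_mass_at_time(m_init, t_myr=1.0e4, t_dis_myr=t_dis)
print(f"T_dis = {t_dis / 1e3:.1f} Gyr, M_cl(10 Gyr) = {m_now:.2e} Msun")
```

In practice the observed quantity is $M_{cl}(T)$, so the two functions would be solved together for $M_{cl,i}$, for example by simple root finding on the initial mass.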



AT ps r — 



From Baumgardt & Makino (2003) (their Figure 14) we ap- 
proximate 

— I , (30) 

where $0.3 < \Gamma < 0.7$. From the observed cluster mass (i.e.,
$M_{cl}(T)$) we solve equations (28) and (29), then use equation
(30) to find $\Delta\Upsilon$. Where the eccentricity of the globular cluster
orbit is unknown (the majority of the cases) we assume $e =
0.5$.

In producing Fig. 6 we have increased the mass to light
ratio of all the clusters by the appropriate amounts, using $\Gamma =
0.7$; the only noticeable change (typically of order 10-30%)
is that suffered by the Milky Way globulars, since they are the
least massive and hence have the shortest $T_{dis}$. However, even
for Milky Way clusters the effect is not large, and neglecting
this correction does not alter the qualitative appearance of the
figure.

There are recent high precision measurements of the
present day mass function (MF) in two globular clusters,
M4=NGC 6121 (Richer et al. 2004) and NGC 6397
(Richer et al. 2008). These were chosen for very long HST
observations partly based on the fact that they are the two
nearest globular clusters.

The metallicity of NGC 6397 is [Fe/H] = -2.03
(Gratton et al. 2003), while the age of the cluster is 11.4
Gyrs (Richer et al. 2008). Using $m_V = 5.73$ and $r_h = 2.33'$
(Harris 1996), the error-weighted mean velocity dispersion
$\sigma = 3.5 \pm 0.2$ km s$^{-1}$ (Pryor & Meylan 1993), and distance
modulus $m - M = 12.1 \pm 0.1$ or $D = 2.6 \pm 0.1$ kpc (Richer et al.
2008), the dynamical mass to light ratio of NGC 6397 is
$\Upsilon_V = 1.6 \pm 0.2$ in solar units. This is marginally consistent
with a Muench et al. stellar model with that age and metallicity
($\Upsilon_V = 1.96$) or with a Kroupa IMF ($\Upsilon_V = 2.1$).
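The dynamical mass to light ratio for NGC 6397 can be approximately recovered from the quantities listed above. The following is a rough sketch, not the authors' calculation: it assumes equation (27) with $\Gamma_{vir} = 10$ and $r_{cl} = r_h$, a solar absolute V magnitude of 4.83, and no extinction correction; all function and constant names are ours.

```python
import math

G_CGS = 6.674e-8      # cm^3 g^-1 s^-2
PC_CM = 3.0857e18     # centimetres per parsec
MSUN_G = 1.989e33     # grams per solar mass
MV_SUN = 4.83         # assumed solar absolute V magnitude

def dynamical_mass_msun(sigma_kms, r_pc, gamma_vir=10.0):
    """Equation (27): M = Gamma_vir * sigma^2 * r / G, returned in Msun."""
    sigma_cms = sigma_kms * 1.0e5
    return gamma_vir * sigma_cms**2 * r_pc * PC_CM / G_CGS / MSUN_G

def v_band_luminosity_lsun(apparent_mag, distance_modulus):
    """V-band luminosity in solar units, with no extinction correction."""
    absolute_mag = apparent_mag - distance_modulus
    return 10.0 ** (0.4 * (MV_SUN - absolute_mag))

# Values quoted in the text for NGC 6397.
r_h_pc = math.radians(2.33 / 60.0) * 2600.0   # 2.33 arcmin at 2.6 kpc
mass = dynamical_mass_msun(3.5, r_h_pc)
lum = v_band_luminosity_lsun(5.73, 12.1)
upsilon_v = mass / lum
print(f"r_h = {r_h_pc:.2f} pc, M = {mass:.2e} Msun, "
      f"L_V = {lum:.2e} Lsun, Upsilon_V = {upsilon_v:.2f}")
```

With these inputs the result falls within the quoted $1.6 \pm 0.2$; applying an extinction correction would raise $L_V$ and lower $\Upsilon_V$.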

Using data from Richer et al. (2004) we find $\Upsilon_V = 1.2 \pm
0.1$ for M4, too low to be consistent with either type of IMF,
but consistent with a maximal amount of dynamical evolution,
i.e., $\Gamma = 0.7$ and an age approaching the cluster disruption age.

Applying the correction for preferential loss of low mass
stars to NGC 6397 we find $\Upsilon_V = 2.1$, while for M4 we
find $\Upsilon_V = 1.5$, the latter now being marginally consistent
with the Muench et al. IMF. This suggests that M4
is more dynamically evolved than our application of the
Baumgardt & Makino (2003) results would indicate.

The present day mass functions of both these clusters can
be fitted by a single power law $dN/dm \sim m^{-\alpha}$ with $\alpha = 0.1$,
compared to the Salpeter value 2.35 or the Muench et al. value
1.15 (for masses below the break mass). This is reminiscent of
the findings of Baumgardt & Makino (2003), who predict that
the MF of evolved clusters will be very flat. However, the simulations
predict $\alpha \approx 0$ only when 90% of the cluster lifetime
has passed. It seems somewhat unlikely that the first two clusters
examined (chosen for their proximity to us) should both
be so near their demise. The rather modest dynamical evolution
needed to explain the slightly low $\Upsilon_V$ for NGC 6397,
coupled with the very low $\alpha$, suggests some primordial mass
segregation in that object.

4.2.2. Non-baryonic dark matter? 
The high $\Upsilon_V \sim 6$ seen in some UCDs studied by
Hasegan et al. (2005) led them to suggest that their objects
might contain a mass in non-baryonic dark matter comparable
to their stellar mass. This suggestion is motivated by the
following argument. The mean $\Upsilon_V \approx 2.0 \pm 0.9$ for Milky Way
globulars, using dynamical masses corrected for the preferential
loss of low mass stars using eqns. (28), (29), and (30) (as
noted above, uncorrected values of $\Upsilon_V$ are about 20% lower
on average). In contrast, $\Upsilon_V \approx 4.7 \pm 1.5$ for $M_{cl} > 10^7 M_\odot$.
As mentioned above, most of the more massive objects have
[Fe/H] < -1, so neither Kroupa nor Muench et al. IMFs can
match the observations. Many of the UCDs appear to have
somewhat younger stellar populations than do the globulars,
adding to the difficulty.

Possible explanations for different values of $\Upsilon_V$ include
problems with the estimate of the mass to light ratios (e.g.,
poorly measured velocity dispersions), dynamical effects such

FIG. 7. — Mean cluster density $\rho = 3M_{cl}/(4\pi r_{cl}^3)$ as a function of cluster
dynamical mass. The symbols are as in Fig. 6, with the following additions:
Open triangles are M82 superstar clusters from McCrady & Graham
(2007). Open magenta circles are nuclear star clusters from Walcher et al.
(2005). Small crosses represent the mean density of Milky Way satellite
dwarf galaxies evaluated at their core radii (typically substantially larger than
10 pc). The small filled (green) and open (red) squares represent the satellite
galaxy densities extrapolated to 10 pc using either an NFW profile or a
Moore et al. (1999) profile. The solid line at the upper right represents the
initial density of radiation pressure supported star clusters (assuming 100%
star formation efficiency), while the dashed line shows the density after stellar
evolution has reduced the mass to $\alpha = 0.4$ of its initial value. The right hand
vertical axis is labeled by the density in solar masses per cubic parsec. Note
that the dark matter dominated dwarf galaxies have mean densities at 10 pc a
factor of (at least) 30 less than the typical density of massive star clusters.

as the preferential loss of low mass stars, changes in the IMF
associated with increasing cluster mass, or the presence of
dynamically significant amounts of non-baryonic dark matter in
the more massive clusters. In the latter case, the dark matter
would have to have a mass 1.3 times larger than the baryonic
matter inside $r_{cl}$ in clusters with $M_{cl} > 10^7 M_\odot$, assuming an
IMF that is the same as that in Milky Way globular clusters.
In other words, the massive clusters and UCDs would have to
be dark matter dominated inside $\sim 10$ pc.

Given what little we know regarding dark matter, this is 
unlikely, as we now show. 

Figure 7 shows the density of the same objects shown in
Fig. 6.

If non-baryonic dark matter is responsible for the elevated
mass to light ratios of clusters with $M_{\rm cl} \sim 10^7 M_\odot$, it must have
a density $> 10^{-19}$ g cm$^{-3}$, or $1000 M_\odot$ pc$^{-3}$, on scales of order
10 pc.
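
The two density units used above can be related with a short conversion. The sketch below is illustrative only: the constants are standard rounded cgs values, and the function names are our own. It shows that $1000 M_\odot$ pc$^{-3}$ is about $7 \times 10^{-20}$ g cm$^{-3}$, so the two quoted figures agree to order of magnitude.

```python
# Standard rounded constants in cgs units.
M_SUN_G = 1.989e33   # solar mass in grams
PC_CM = 3.0857e18    # parsec in centimetres


def msun_per_pc3_to_cgs(rho_msun_pc3: float) -> float:
    """Convert a density from Msun pc^-3 to g cm^-3."""
    return rho_msun_pc3 * M_SUN_G / PC_CM**3


def cgs_to_msun_per_pc3(rho_cgs: float) -> float:
    """Convert a density from g cm^-3 to Msun pc^-3."""
    return rho_cgs * PC_CM**3 / M_SUN_G


if __name__ == "__main__":
    print(f"1000 Msun/pc^3 = {msun_per_pc3_to_cgs(1000.0):.2e} g/cm^3")
    print(f"1e-19 g/cm^3   = {cgs_to_msun_per_pc3(1e-19):.0f} Msun/pc^3")
```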

We have examples of dark matter dominated objects for
which the density is well measured, namely Milky Way dwarf
spheroidal galaxies. These objects have stellar velocity
dispersions ranging from $\sigma = 4$ km s$^{-1}$ to 27.5 km s$^{-1}$, the latter
corresponding to the LMC. The crosses in Fig. 7 show the
central densities

    $\rho_0 = 166\,\sigma^2 / r_c^2 \;\; M_\odot\,{\rm pc}^{-3}$    (31)

of Milky Way dwarf spheroidal galaxies as tabulated by
Madau et al. (2008). In this expression $\sigma$ is in km s$^{-1}$, $r_c$ is
in pc, and $r_c = 0.64\,r_{\rm cl}$ is the core
radius of the stellar light (recall that $r_{\rm cl}$ is the projected half
light radius).
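
As a consistency check on the coefficient in equation (31), the sketch below assumes that it comes from the King-model central density $9\sigma^2/(4\pi G r_c^2)$, with $G$ expressed in pc (km s$^{-1}$)$^2 M_\odot^{-1}$. That origin is our assumption, not something stated in the text, and the function names are illustrative. The example pairs the lowest quoted dispersion (4 km s$^{-1}$) with the 13 pc core radius quoted for Willman 1; that pairing is also an assumption made only for illustration.

```python
import math

# Gravitational constant in pc (km/s)^2 / Msun (standard rounded value).
G_PC_KMS = 4.301e-3


def king_coefficient() -> float:
    """Coefficient of sigma^2 / r_c^2 in the King central density, 9 / (4 pi G)."""
    return 9.0 / (4.0 * math.pi * G_PC_KMS)


def central_density(sigma_kms: float, r_core_pc: float) -> float:
    """Central density in Msun pc^-3 for sigma in km/s and r_core in pc."""
    return king_coefficient() * sigma_kms**2 / r_core_pc**2


if __name__ == "__main__":
    print(f"coefficient = {king_coefficient():.1f}")
    # Illustrative pairing of values quoted separately in the text.
    print(f"rho_0 = {central_density(4.0, 13.0):.1f} Msun/pc^3")
```

With these inputs the density is roughly 16 $M_\odot$ pc$^{-3}$, i.e. of order $10^{-21}$ g cm$^{-3}$.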
The highest density Milky Way satellite galaxies currently
known, Willman 1 and Coma Berenices, have $\rho = 10^{-21}$ g cm$^{-3}$
and $10^{-22}$ g cm$^{-3}$ respectively, a factor of one hundred to one
thousand below the mean density of the compact $M_{\rm cl} = 10^7 M_\odot$
clusters. More massive dwarf galaxies have much lower mean
(core) densities, typically $\rho \sim 10^{-23}$ g cm$^{-3}$; all these dark
matter densities are far too small to explain the high densities seen
in compact clusters.

The two highest density satellite galaxies (Willman 1 and
Coma Berenices) are also the most compact Milky Way galaxies,
with $r_c = 13$ pc and 41 pc, respectively. This raises the
possibility that the larger satellites have higher densities at
radii smaller than $r_c$, as expected on theoretical grounds. In
fact, numerical simulations find dark matter halos that follow
a density profile given by Navarro et al. (1997),

    $\rho(r) \propto \left[(r/r_s)(1 + r/r_s)^2\right]^{-1}$    (32)

(the NFW profile), or the slightly steeper Moore et al. (1999)
profile

    $\rho(r) \propto \left[(r/r_s)^{1.5}\left(1 + (r/r_s)^{1.5}\right)\right]^{-1}$.    (33)

If we assume $r_s \gg r_c$ for the Milky Way dwarf satellites,
we can scale the density to 10 pc; doing so, we find the
small filled squares in Fig. 7 (NFW profile) and the small
open squares (the Moore et al. (1999) profile). The maximum
mean density at 10 pc is similar to that of Willman 1,
$\rho \approx 10^{-21}$ g cm$^{-3}$, far too small to explain the high mass to
light ratios of the massive star clusters and UCDs.
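
The extrapolation described above can be sketched as follows. For $r \ll r_s$ the NFW profile behaves as $\rho \propto r^{-1}$ and the Moore et al. profile as $\rho \propto r^{-1.5}$, so a density measured at $r_c$ scales to 10 pc by a simple power of $r_c/10\,{\rm pc}$. The function name and the input values in the example (a satellite with $\rho = 10^{-23}$ g cm$^{-3}$ at $r_c = 300$ pc) are hypothetical and chosen only to illustrate the scaling.

```python
def extrapolate_density(rho_core: float, r_core_pc: float,
                        r_target_pc: float = 10.0,
                        inner_slope: float = 1.0) -> float:
    """Scale a density from r_core to r_target assuming rho ~ r^-inner_slope.

    Valid only well inside the scale radius r_s; inner_slope is 1.0 for an
    NFW profile and 1.5 for a Moore et al. profile.
    """
    return rho_core * (r_core_pc / r_target_pc) ** inner_slope


if __name__ == "__main__":
    rho_core, r_core = 1e-23, 300.0  # hypothetical satellite, g/cm^3 and pc
    print(f"NFW:   {extrapolate_density(rho_core, r_core, inner_slope=1.0):.2e}")
    print(f"Moore: {extrapolate_density(rho_core, r_core, inner_slope=1.5):.2e}")
```

Even with the steeper inner slope, this hypothetical satellite reaches only about $2 \times 10^{-21}$ g cm$^{-3}$ at 10 pc.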

We show in Appendix D that the relatively low inferred
mean dark matter density at $r = 10$ pc is consistent with the
highest resolution simulations, e.g., Diemand et al. (2007).

We conclude that compact massive star clusters (GCs and
UCDs) are not (non-baryonic) dark matter dominated. This
does not mean that they contain no non-baryonic dark matter:
if they form in the center of their own dark matter halo, the
baryons will, when they collapse, gravitationally compress
the inner part of the dark matter halo (Blumenthal et al. 1986).
However, simple calculations show that the fraction of dark
matter inside $r_{\rm cl}$ for objects as concentrated as the compact
clusters is typically less than $\sim 30\%$.

4.3. The mass-radius relation for massive clusters 

We have already noted the striking observational result that
both young moderately massive ($10^4 M_\odot < M_{\rm cl} < 10^6 M_\odot$)
clusters and (old) globular clusters show no systematic
variation of radius with mass. Equation (17) predicts that
radiatively supported protoclusters should have radii
that increase with increasing mass. We therefore expect that
evolved massive clusters will inherit this mass-radius relation.
Figure 8 shows the mass radius relation for low mass clusters
in the Milky Way, and high mass clusters and ultra compact
dwarfs (UCDs) from a variety of external galaxies. The
prediction of equation (17) is shown as the solid line in the
Figure. This is the initial radius of a cluster, before it has
evolved, and should be compared with young massive clusters
such as those in M82 studied by McCrady et al. (2003)
and McCrady & Graham (2007) (the open triangles in the Figure).

FIG. 8. Cluster radius in parsecs plotted against cluster mass (in solar
masses). Symbols are as in Fig. 7, with two additions: Small filled squares
are embedded (young) Milky Way clusters from Lada & Lada (2003), while
open squares with diagonals are massive young Milky Way clusters. The
solid line shows the radiation radius computed from equation (17). The
dashed line shows the cluster radius after mass loss (corresponding to a
lockup fraction $\alpha = 0.4$) assuming adiabatic expansion of the cluster.

The dashed line is the predicted relation for old massive
clusters; the initial $M_{\rm cl}$ has been reduced by the lockup
fraction $\alpha$ (taken to be fixed at 0.4, although it will vary with $M_{\rm cl}$),
while $r_{\rm cl}$ has been increased by the factor $1/\alpha$, as expected
for adiabatic expansion. This line
should be compared to the old massive clusters and UCDs.
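
The mapping from the solid (initial) line to the dashed (evolved) line can be sketched as below. We assume adiabatic expansion, so that the product of mass and radius is conserved, and we take the normalization $r_{\rm cl} \approx 0.3\,(M_{\rm cl}/10^6 M_\odot)^{3/5}$ pc quoted in the abstract. The function names are illustrative and not part of the original analysis.

```python
def initial_radius_pc(m_init_msun: float) -> float:
    """Initial radius from r ~ 0.3 (M / 1e6 Msun)^(3/5) pc (abstract normalization)."""
    return 0.3 * (m_init_msun / 1e6) ** 0.6


def evolve_cluster(m_init_msun: float, r_init_pc: float,
                   alpha: float = 0.4) -> tuple[float, float]:
    """Apply mass loss and adiabatic expansion: M -> alpha * M, r -> r / alpha."""
    return alpha * m_init_msun, r_init_pc / alpha


if __name__ == "__main__":
    m0 = 1e7
    r0 = initial_radius_pc(m0)
    m1, r1 = evolve_cluster(m0, r0)
    print(f"initial: M = {m0:.1e} Msun, r = {r0:.2f} pc")
    print(f"evolved: M = {m1:.1e} Msun, r = {r1:.2f} pc")
```

A cluster born with $10^7 M_\odot$ at about 1.2 pc thus ends with $4 \times 10^6 M_\odot$ at about 3 pc.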

The fact that the low mass clusters ($M_{\rm cl} < 5 \times 10^5 M_\odot$) are
well above the line indicates that something other than radia- 
tion supported these clusters when they formed, i.e., that ra- 
diation support was not important in their evolution. In con- 
trast, the more massive clusters lie near the radius at which 
radiation support becomes important, and their radii do show 
a trend of increasing radius with increasing mass. This lends 
some credence to the notion that radiation plays a role in the 
formation of the most massive star clusters. 

5. DISCUSSION 

The properties of star clusters with masses ranging from
$\sim 100 M_\odot$ to $10^8 M_\odot$ vary continuously with cluster mass, as
seen in Figures 6 through 9. There are sharp changes in the
slope of cluster radius at $M_{\rm cl} \approx 10^4 M_\odot$ and at $M_{\rm cl} \approx 10^6 M_\odot$,
and associated features in $\Sigma$, $\rho$ and cluster escape velocity at
these masses.

We have argued that the change of slope in $r_{\rm cl}(M_{\rm cl})$ at
$M_{\rm cl} = 10^6 M_\odot$ arises from the emergence of radiation pressure
as a dynamically significant player in the formation of these
massive clusters. The feature at $M_{\rm cl} = 10^4 M_\odot$ remains
unexplained.

In contrast to the observation of two changes in the slope
of $r_{\rm cl}$ with $M_{\rm cl}$, there is a single change in the slope of $\Upsilon_V$,
at $M_{\rm cl} \approx 10^6 M_\odot$. We assert that this is due to the abrupt
increase in gas temperature with increasing cluster mass,
associated with the change from radiatively thin to radiatively
thick cooling. The sharp jump in $T(M_{\rm cl})$, the slower change
of $\rho(M_{\rm cl})$, and the stronger dependence of Jeans mass on $T$
($M_J \sim T^{3/2}/\rho^{1/2}$) result in a rapid increase in Jeans mass
with increasing cluster mass above $M_{\rm cl} = 10^6 M_\odot$. We have
identified this as the underlying cause for the increase in $\Upsilon_V$
with increasing $M_{\rm cl}$ above $\sim 10^6 M_\odot$.

FIG. 9. Cluster surface density plotted against cluster mass (in solar
masses). Data points are as in Fig. 8. The solid line shows the maximum
(initial) surface density allowed by the Eddington limit, equation (18). The
dashed line shows the surface density after $\sim 10$ Gyrs, with a lockup fraction
$\alpha = 0.4$. The right hand vertical axis is labeled in units of solar masses per
square parsec.
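
The scaling $M_J \sim T^{3/2}/\rho^{1/2}$ makes the argument quantitative: the Jeans mass responds more strongly to temperature than to density. The sketch below evaluates only the ratio of Jeans masses between two states; the function name and the example factors are illustrative assumptions, not values taken from our models.

```python
def jeans_mass_ratio(temperature_ratio: float, density_ratio: float) -> float:
    """Ratio of Jeans masses for M_J ~ T^(3/2) / rho^(1/2)."""
    return temperature_ratio**1.5 / density_ratio**0.5


if __name__ == "__main__":
    # Tripling T at fixed density raises M_J by about a factor of 5.2.
    print(f"{jeans_mass_ratio(3.0, 1.0):.2f}")
    # Tripling both T and rho still raises M_J by a factor of 3.
    print(f"{jeans_mass_ratio(3.0, 3.0):.2f}")
```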

The results regarding the IMF found here (Figs. 2, 5, and
the lines in Fig. 6) show that $\Upsilon_V$ should increase with
increasing cluster mass even in the absence of non-baryonic
dark matter, up to values as large as 6, and possibly higher
(the bulk of the extra mass is, in the case of a top heavy IMF,
in the form of white dwarfs). We have also argued that
non-baryonic dark matter in Galactic satellite galaxies on scales of
$\sim 10$ pc has densities smaller by a factor of 30 than the
densities observed in massive star clusters and UCDs. It appears
unlikely that the high mass to light ratios of the compact
clusters are due to non-baryonic dark matter.

Fellhauer & Kroupa (2002) model UCDs as merged globular
clusters. Their clusters, at $M_{\rm cl} = 2 \times 10^7 M_\odot$ and with radii
ranging from 39 to 72 pc or larger, are larger than all but one
of the objects plotted in Figure 8.

Bekki et al. (2001) and Bekki et al. (2003) argue that UCDs
are the remnants of nucleated dwarf galaxies; the nuclei lose
their envelopes due to tidal stripping in the gravitational field
of their host cluster halos. This raises the question of how the
nuclei formed. To address this question, Bekki et al. (2004)
explore the same mechanism as Fellhauer & Kroupa (2002),
merging of globular clusters. They find scaling relations that
imply a mass radius relation

    $r_{\rm cl} \sim M_{\rm cl}^{0.38}$,    (34)

which would imply, for example, that the surface density of
massive clusters would increase with increasing mass. This
would appear to be ruled out for clusters with $M_{\rm cl} > 3 \times
10^6 M_\odot$.

Globular clusters have mass to light ratios $\Upsilon_V \approx 2$,
consistent with the notion that they contain no non-baryonic dark
matter. Models for massive clusters that merge globulars will
not naturally trap large masses of dark matter, so the merger
products will also lack dark matter, and have $\Upsilon_V \approx 2$, smaller
than the values observed for many of the UCDs shown in
Figure 6.



Evstigneeva et al. (2007) find from spectroscopy that the
ages, metallicities, and abundances of Virgo UCDs are similar
to those of Virgo globular clusters. However, the mass to
light ratios and the mass-radius relations differ. This is seen
most clearly in the lower panel of their Figure 6, which is
very roughly a plot of $\Sigma$ against mass (as in Figure 9 here).
They conclude that the internal properties of Virgo UCDs are
consistent with them "being the high-mass/high-luminosity
extreme of known GC populations", apparently emphasizing
the continuity of their distribution in the $\kappa_2$-$\kappa_1$ (roughly the
$\Sigma$-$M_{\rm cl}$) fundamental plane rather than the clear break in the
slope that they find.

In this work the break in scaling properties is attributed 
to the emergence of radiation pressure as a dynamically sig- 
nificant element in the formation of the cluster. This result 
strengthens the case for treating globular clusters and UCDs 
on a unified footing. 

Mieske & Kroupa (2008) note the high values of $\Upsilon_V$ found
for UCDs, and suggest that it is due to a non-standard IMF.
Their explanation is that the IMF is bottom-heavy, i.e., that
there are more low mass stars per unit stellar mass than in the
standard (Milky Way) IMF. It may appear paradoxical that
high $\Upsilon_V$ values arise from both top heavy and from bottom
heavy IMFs, but in both cases one appeals to an excess of
under-luminous stars. In a top heavy IMF, at late times, the
extra dead weight is found in stellar remnants, while in bottom
heavy IMFs, the dead weight is found in low mass stars (well
below the main sequence turnoff). Mieske & Kroupa (2008)
suggest observations of CO at 2.3 $\mu$m as a way to test for the
presence of large numbers of low mass stars.

Klessen et al. (2007) suggest that in warm star-bursting
circumnuclear gas, as in the Galactic center, the IMF will be
top heavy compared to the IMF in the rest of the Milky Way. They
present numerical simulations showing that this is the case.
They attribute the difference to the larger Jeans mass in the
warm gas, as argued here. In this paper, we attribute the
higher temperature to accretion of the cluster gas, as opposed
to radiation from stars surrounding the proto-cluster; however,
the crucial point is that higher temperatures in the ISM will
lead to a top heavy IMF.

6. CONCLUSIONS 

We have shown that the Jeans mass in a cluster is roughly 
independent of the cluster mass as long as the cluster is op- 
tically thin to the FIR, and assuming that the mass radius re- 
lation seen in embedded clusters reflects the primordial mass 
radius relation. Clusters forming today in the Milky Way are
optically thin, so this finding is consistent with the observed 
constancy of the IMF in our galaxy. 

Many Milky Way globular clusters were also optically thin, 
or had low enough velocities that their accretion luminosities 
would not have raised the gas temperature significantly. The 
Jeans mass in such clusters would then depend only on their 
radii at the time of formation. 

However, we have pointed out that many clusters in other
galaxies, and in the Milky Way in the past, are or were
optically thick to the FIR; such clusters often (though not always)
have high enough accretion luminosities that the gas
temperature would have been higher than 10 K. We went on to argue
that the Jeans mass is larger than a solar mass in such
clusters. Using the assumption that the break in the IMF is
associated with the Jeans mass, we concluded that massive clusters
should have high L/M ratios when young, and high M/L
ratios when more than a few Gyrs old.

The prediction of high L/M is consistent with observations
of apparently enhanced light to mass ratios in the superstar
clusters M82-F and M82-11, although the optically thick
clusters M82-9 and NGC 4038:W99-15 appear to have normal
M/L ratios. Other superstar clusters in M82 and other nearby
galaxies are optically thin, and so they should have normal
IMFs. Clearly, more data on superstar clusters would be
helpful.

A top heavy IMF, which we showed should occur in the 
most massive and compact globular clusters, will tend to pro- 
duce an excess of LMXBs and pulsars in metal rich globular 
clusters in both the Milky Way and in nearby galaxies, and at 
the same time, a higher M/L ratio. 

Compelling support for the notion of a high mass to light
ratio in massive clusters is supplied by Figure 6, which
compares the predicted and observed mass to light ratio for objects
with masses ranging up to $10^8 M_\odot$. As the solid lines in the
Figure indicate, this result is consistent with the prediction
of a top-heavy IMF resulting from the elevated temperatures
associated with accretion in an optically thick environment.

We also argued that in the most massive clusters, with
$M_{\rm cl} \gtrsim 3 \times 10^6 M_\odot$, the substantial luminosity associated with
the contraction of the cluster must have been dynamically
important. The predicted mass-radius relation, $r_{\rm cl} \sim M_{\rm cl}^{3/5}$, leads
to a number of relations between the properties of massive star
clusters, involving the cluster velocity dispersion, the cluster
luminosity, and the surface density; for example, we found
$L_V \sim \sigma^4$. These relations appear to be consistent with the
observed properties of massive star clusters.

The combination of strong evidence for a cluster-mass
dependent mass to light ratio and for a mass-radius relation
$r_{\rm cl} \sim M_{\rm cl}^{3/5}$ provides solid support for the idea that a
contraction powered radiation field has left its mark on the most
massive star clusters we see around us.

It is fitting to close by emphasizing again the striking fact
that globular clusters are observed to have radii of 2-3 parsecs,
with only weak dependence on cluster mass over a range
$10^4 M_\odot$ to $3 \times 10^6 M_\odot$. The origin of this (lack of a)
mass-radius relation remains a major puzzle.
APPENDIX



LOG-NORMAL PROBABILITY DENSITY FUNCTIONS 



Both analytic theory and numerical simulations of turbulent flows find that the density follows a log-normal probability density
function. While the interpretation of observations is difficult, there are indications that gas in star forming regions in the Milky
Way does as well. In this appendix we show that this fact implies that the characteristic density $\rho_m$ of the gas out of which stars
in star clusters form is no more than a factor of $\sim 10$ larger than the mean density of the protocluster.

Following Ostriker, Stone & Gammie (2001), let $y = \log(\rho/\bar\rho)$, where the logarithm is base 10 and $\bar\rho$ is the mean density of the
cluster. Then the probability density function is

    $f_M(y) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left[-(y-\mu)^2/(2\sigma^2)\right]$.    (A1)

The quantity $f_M(y)\,dy$ is the fraction of the cluster mass with density contrast $y$ in the interval $(y, y+dy)$. Demanding conservation
of mass (in the form of the continuity equation) leads to a relation between the mean $\mu$ and the dispersion $\sigma$ of $y$:

    $\mu = \frac{1}{2}\ln(10)\,\sigma^2$.    (A2)

Letting

    $t = (y-\mu)/(\sqrt{2}\,\sigma)$,    (A3)

the fraction of the cluster mass having $\rho > \rho_m$ is

    $f_g(>\rho_m) = \frac{1}{2}\,{\rm erfc}(t_m)$,    (A4)

where $t_m$ is $t$ evaluated at $y_m = \log(\rho_m/\bar\rho)$.

Figure 10 plots both $f_M(\rho/\bar\rho)$ and $f_g(>\rho_m)$. We have taken $\mu = 0.4$; in the simulations of Ostriker, Stone & Gammie (2001) this
corresponds to a fast magnetosonic Mach number $\mathcal{M}_F \sim 3$.

If star formation occurs on a crossing time, the probability density function is sampled once before the protocluster gas is
dissipated. During this time a fraction between 0.1 and 0.3 of the gas must form stars. These fractions are indicated by the upper
two horizontal dotted lines in the Figure. The corresponding overdensities are 14 and 5 times the mean density of the clump, respectively.
Under the extreme assumption that only the densest gas in the protocluster forms stars, the Jeans mass of the relevant gas is a factor
of between two and four smaller than the mean Jeans mass of the clump.

If star formation takes more than a single crossing time, the probability density is sampled more than once, so a larger fraction
of the gas may be turbulently compressed to high density. The bottom dotted line corresponds to star formation in a cluster in
which SFE = 0.1 and for which star formation persisted for five dynamical times. We see that the characteristic density for this
case is a factor 40 larger than the mean density, resulting in a Jeans mass 6.3 times smaller than the Jeans mass calculated using
the mean density.

FIG. 10. The solid line shows the gas fraction $f_g(>\rho/\bar\rho)$ from eqn. (A4). Also shown is the probability density function $f_M(y) = dM/M_{\rm tot}$ (dashed line)
for $\mu = \langle\log \rho/\bar\rho\rangle_M = 0.4$ (corresponding to $\mathcal{M}_F \approx 3$). The solid vertical line is at the mass-weighted mean density $\mu$. The horizontal dotted lines are drawn
assuming that $f_g$ = SFE, i.e., that only the densest gas in the cluster eventually ends up in stars, for SFE = 0.3, 0.1, and 0.1/5. The last value assumes that the
stars form over 5 dynamical times, so that $f_M$ is sampled five times.
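
The overdensities quoted in this appendix can be reproduced from the log-normal mass fraction with a few lines of code. The sketch below assumes the mass-weighted distribution has mean $+\mu$, so that the mass fraction above overdensity $\rho_m/\bar\rho$ is $\frac{1}{2}{\rm erfc}[(y_m-\mu)/(\sqrt{2}\sigma)]$ with $\sigma$ fixed by $\mu = \frac{1}{2}\ln(10)\sigma^2$. The function names are illustrative.

```python
import math


def sigma_from_mu(mu: float) -> float:
    """Dispersion of y = log10(rho / rho_mean) from mu = 0.5 * ln(10) * sigma^2."""
    return math.sqrt(2.0 * mu / math.log(10.0))


def mass_fraction_above(overdensity: float, mu: float = 0.4) -> float:
    """Fraction of the cluster mass with rho / rho_mean above `overdensity`."""
    sigma = sigma_from_mu(mu)
    t = (math.log10(overdensity) - mu) / (math.sqrt(2.0) * sigma)
    return 0.5 * math.erfc(t)


def jeans_mass_reduction(overdensity: float) -> float:
    """Factor by which M_J drops at fixed T, since M_J ~ rho^(-1/2)."""
    return math.sqrt(overdensity)


if __name__ == "__main__":
    for x in (5.0, 14.0, 40.0):
        print(f"rho/rho_mean = {x:4.0f}: f_g = {mass_fraction_above(x):.3f}, "
              f"M_J smaller by {jeans_mass_reduction(x):.1f}")
```

For $\mu = 0.4$ the mass fractions at overdensities of 5, 14, and 40 come out close to 0.3, 0.1, and 0.02, matching the three dotted lines in Figure 10.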

These estimates for the Jeans mass are likely underestimates; many self-gravitating bodies of gas with lower density but larger
sizes will be produced by the turbulence, as argued by, e.g., Padoan & Nordlund (2002). In fact the argument made here is similar
to that made by those authors; both rely on the steep decline in the PDF with increasing density. Here we are emphasizing the 
observational constraints provided by the SFE and estimates of the duration of star formation rather than the dynamical constraint 
that stars form out of self gravitating material. 



LUMINOSITY FUNCTIONS AND INITIAL MASS FUNCTIONS 

The number of stars in a cluster having a given mass is described by the initial mass function (IMF), $\phi(m)\,dm$, the number of
stars with masses between $m$ and $m + dm$, where $m$ is measured in solar masses, $M_\odot = 2 \times 10^{33}$ g. The IMF is normalized so that

    $\int_{m_L}^{m_U} m\,\phi(m)\,dm = 1$,    (B1)

where $m_L$ and $m_U$ are lower and upper mass limits. We sometimes employ the Salpeter mass function, which for $m_L = 0.1$,
$m_U = 100$, is $\phi_{\rm Sal} = 0.17\,m^{-\alpha}$ with $\alpha = 2.35$. We also use the observed IMF for the Orion Nebula, given in Muench et al. (2002);
for simplicity we set $\phi(m) = 0$ for masses below $0.025 M_\odot$.
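
The Salpeter normalization constant follows directly from equation (B1). The sketch below integrates $m \cdot m^{-\alpha}$ analytically between the mass limits and inverts the result; the function name is illustrative, and the closed form assumes $\alpha \neq 2$.

```python
def salpeter_norm(m_lo: float = 0.1, m_hi: float = 100.0,
                  alpha: float = 2.35) -> float:
    """Constant A such that the integral of m * A * m^-alpha over [m_lo, m_hi] is 1.

    Uses the closed-form integral of m^(1 - alpha); requires alpha != 2.
    """
    p = 2.0 - alpha  # exponent after integrating m^(1 - alpha)
    integral = (m_hi**p - m_lo**p) / p
    return 1.0 / integral


if __name__ == "__main__":
    print(f"A = {salpeter_norm():.3f}")
```

For $m_L = 0.1$, $m_U = 100$, and $\alpha = 2.35$ this gives a constant of about 0.17, as quoted.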
    $\phi_{\rm Mu}(m) = \frac{dN}{dm} \propto \begin{cases} m^{-2.21}, & m_U > m > m_1 \\ m_1^{1.15-2.21}\, m^{-1.15}, & m_1 > m > m_2 \\ m_1^{1.15-2.21}\, m_2^{0.27-1.15}\, m^{-0.27}, & m_2 > m > m_L \end{cases}$    (B2)

Muench et al. (2002) found $m_1 = 0.6 M_\odot$ and $m_2 = 0.12 M_\odot$. The high mass end of the IMF is described by a power law with index
$\alpha = 2.21$, similar to that of the Salpeter IMF. Note that we use $dN/dm$ rather than $dN/d\log m$, so the exponents that appear here
are equal to those quoted in Muench et al. (2002) minus one, i.e., their exponent $-1.21$ becomes our exponent $-2.21$, while their
$+0.73$ becomes $-0.27$ here.
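
The piecewise form of the Muench et al. IMF, and the conversion between $dN/d\log m$ and $dN/dm$ exponents, can be sketched as follows. The shape function is left unnormalized (the normalization of equation (B1) is not applied), the prefactors are chosen so that the three segments join continuously at $m_1$ and $m_2$, and the function names are illustrative.

```python
def dlogm_to_dm_exponent(exponent_dlogm: float) -> float:
    """Convert a dN/dlog(m) power-law exponent to the dN/dm exponent."""
    return exponent_dlogm - 1.0


def muench_imf_shape(m: float, m1: float = 0.6, m2: float = 0.12) -> float:
    """Unnormalized three-segment IMF dN/dm, continuous at m1 and m2."""
    if m > m1:
        return m**-2.21
    if m > m2:
        return m1 ** (1.15 - 2.21) * m**-1.15
    return m1 ** (1.15 - 2.21) * m2 ** (0.27 - 1.15) * m**-0.27


if __name__ == "__main__":
    print(dlogm_to_dm_exponent(-1.21), dlogm_to_dm_exponent(0.73))
    for m in (0.05, 0.12, 0.6, 1.0):
        print(f"m = {m:5.2f}: phi ~ {muench_imf_shape(m):.3f}")
```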

The Muench et al. IMF is similar to that of Kroupa (2001), which has a high mass slope of $\alpha = 2.3$ and a low mass slope
$\alpha = 1.3$, slightly steeper than that of Muench et al.



The light to mass ratio 

We use the Padova stellar evolution tracks (Bertelli et al. 1994; Girardi et al. 1996; Girardi 2006, http://pleiadi.pd.astro.it)
to find the luminosity of stars of mass $m$ at an age of $\approx 2.5$ Myrs. For stars with $m < 9$ we use the isochrone for 4 Myrs; these stars
contribute very little luminosity, so the error involved in doing so is minor. For more massive stars we searched the evolutionary
tracks to find the age nearest 2.5 Myrs, since the luminosity of these stars varies rapidly with age. The result is $L(m)$ in the form
of a table.

To find the light to mass ratio for stars at an age of 2.5 Myrs, we integrate

    $\frac{L}{M} = \int_{m_L}^{m_U} L(m)\,\phi(m)\,dm$.    (B3)

For a Salpeter IMF with $m_L = 0.1$ and $m_U = 120 M_\odot$ (the estimated initial mass of $\eta$ Carinae) the light to mass ratio is 2010 in cgs
units. Using the Muench et al. (2002) IMF with the same $m_L$ and $m_U$, $m_2 = 0.1$ and $m_1 = 0.6$, the ratio is 4470. Part of this difference
is due to the difference in slope at the high mass end of the two IMFs; using a slope of 2.21, the light to mass ratio of a simple
power law is 1790. The rest of the difference is due to what is effectively a low mass cutoff at $m = 0.6$ for the Muench IMF. This
factor of two difference in the light to mass ratio for the different IMFs will lead to a factor of two difference in the predicted
efficiency of star formation.

In the main text we argue that $m_1$ is set by the Jeans mass, and that the latter varies with cluster mass. Figure 3 shows the light
to mass ratio for a Muench et al. IMF as a function of $m_1$, keeping $m_L$, $m_U$, and $m_2$ fixed at their original values.



RADIATION PRESSURE SUPPORTED CLUSTER SIZES 



While there is strong observational support for a characteristic globular cluster size, we are not aware of a good physical
explanation for it, nor do we give one here. Instead, in this appendix we investigate the possibility that for massive clusters, the
cluster length scale is determined by the interplay between the self-gravity of a clump of gas, the turbulent pressure $\rho v_t^2$, and in
very dense and massive systems, radiation pressure. The turbulent pressure is

    $P_{\rm turb} = \rho v_t^2$.    (C1)

The turbulent velocity is well above the sound speed in the clusters under consideration here, so we expect that it will decay
on the dynamical time of the cluster. The turbulent pressure therefore cannot halt the collapse of the protocluster, but should act
to slow it. We assume that as the collapse proceeds, the gravitational binding energy released is converted into new turbulent
motions, which then shock and dissipate their energy as heat. This heat is converted into thermal radiation, which then diffuses
out of the cluster.

To model this, we assume that the turbulent pressure is a fixed fraction of the dynamical pressure, or, in terms of accelerations,

    $a_{\rm turb} = (1-\gamma)\,|a_{\rm grav}| = (1-\gamma)\,\frac{GM(r)}{r^2}$.    (C2)

We usually take $\gamma \approx 0.2$; choosing different values leads to different cluster radii and Jeans masses, when the protocluster is
radiatively supported.

The thermal radiation will diffuse out of the protocluster, but before it does it provides a pressure opposed to that of gravity:

    $P_{\rm rad} = \frac{a}{3}\,T^4$,    (C3)

where $a = 4\sigma_B/c$ and $\sigma_B$ is the Stefan-Boltzmann constant. In the optically thick limit one can model this as a diffusion process,
so that

    $\frac{dP_{\rm rad}}{dr} = -\frac{\kappa\rho L}{4\pi r^2 c}$,    (C4)

where $\kappa(T,\rho,Z)$ is the Rosseland mean opacity for the appropriate temperature, density, and metallicity. In our crude models we
assume that the luminosity is roughly constant, independent of radius, as would be the case for a flow that is near free fall.
It is easy to show that for these massive clusters the gas pressure is negligible compared to the radiation pressure; one way to
see this is to note that the turbulent motions are highly supersonic.

The momentum equation for an infalling shell is

    $\frac{dv}{dt} = -\frac{GM(r)}{r^2} + a_{\rm turb} + \frac{\kappa L}{4\pi r^2 c}$.    (C5)

The luminosity is given by

    $L = \frac{GM(r)^2}{r^2}\,v(r)$.    (C6)

Using this we can write

    $\frac{dv}{dt} = -\gamma\,\frac{GM(r)}{r^2} + \frac{v}{c}\,\frac{\kappa M(r)}{4\pi r^2}\,\frac{GM(r)}{r^2}$.    (C7)

The first term on the right hand side of this equation is the effective (turbulence reduced) acceleration due to gravity, while the
second term is the acceleration due to radiation pressure. The radiation term is proportional to the rate of collapse.

In our numerical work we start the integration at a large radius, where the second term is negligible. We stop the integration
when the radiation pressure slows the collapse by 20-30% compared to the case with no radiation pressure. More exactly, if
$v_g = \sqrt{GM_{\rm cl}/r}$, we define $\tau_g = r/v_g$ and $\tau_{\rm dyn} = r/v$. We stop the calculation when $\tau_{\rm dyn}/\tau_g > 1.2$ or 1.3. Using the approximation
that $v \approx \sqrt{2\gamma GM(r)/r}$, this occurs at a radius

    $r_{\rm rad} = \phi_{\rm rad}\,G^{1/5} \left(\frac{\kappa}{4\pi c}\right)^{2/5} M_{\rm cl}^{3/5}$,    (C8)

where $\phi_{\rm rad}$ is a dimensionless constant; for our simple model, $\phi_{\rm rad} \approx 3$.
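
To give a feel for the numbers, the sketch below evaluates the radiation radius under one reading of equation (C8), $r_{\rm rad} = \phi_{\rm rad} G^{1/5} (\kappa/4\pi c)^{2/5} M_{\rm cl}^{3/5}$. Both that reading and the opacity $\kappa = 1$ cm$^2$ g$^{-1}$ are assumptions made for illustration only; the opacity in the models depends on temperature, density, and metallicity. The function name is ours.

```python
import math

# Standard rounded constants in cgs units.
G_CGS = 6.674e-8     # gravitational constant
C_CGS = 2.998e10     # speed of light
M_SUN_G = 1.989e33   # solar mass in grams
PC_CM = 3.0857e18    # parsec in centimetres


def radiation_radius_pc(m_cl_msun: float, kappa_cgs: float = 1.0,
                        phi_rad: float = 3.0) -> float:
    """Radiation radius in pc; kappa_cgs (cm^2/g) is an assumed constant opacity."""
    m_g = m_cl_msun * M_SUN_G
    r_cm = (phi_rad * G_CGS**0.2
            * (kappa_cgs / (4.0 * math.pi * C_CGS)) ** 0.4
            * m_g**0.6)
    return r_cm / PC_CM


if __name__ == "__main__":
    for mass in (1e6, 1e7, 1e8):
        print(f"M = {mass:.0e} Msun: r_rad = {radiation_radius_pc(mass):.2f} pc")
```

With these assumed inputs a $10^6 M_\odot$ cluster has a radius of roughly 0.3 pc, and the radius grows as $M_{\rm cl}^{3/5}$.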

The rationale for stopping the integration is that, when the dynamical time $\tau_{\rm dyn}$ exceeds the gravitational crossing time $\tau_g$, the
cluster is partially supported by radiation pressure. The rate at which gravitational binding energy is converted into turbulent
motion is reduced, so the level of turbulent pressure support will drop; the turbulent dissipation time will remain comparable to
$\tau_{\rm grav}$, so the turbulence will decay more rapidly than the cluster shrinks. We assume that as the turbulence decays, star formation
will proceed at a rapid pace, and the cluster radius will be frozen in at or about the radius at which radiation pressure support
becomes important.

NUMERICAL ESTIMATES OF DARK MATTER DENSITIES 

In section 4.2.2 we argued by reference to the properties of known dark matter-dominated objects (Milky Way satellite dwarf
galaxies) that dark matter densities on the scale of 10 pc were $\rho_{\rm dm} \sim 10^{-21}$ g cm$^{-3}$, much less than the densities of massive compact
star clusters ($\rho \sim 10^{-19}$ g cm$^{-3}$; see Fig. 7). In this appendix we use the results of numerical calculations to estimate the maximum
density on small scales (Navarro et al. 1997; Bullock et al. 2001; Diemand et al. 2007; Madau et al. 2008).

It is traditional to use the concentration $c_{\rm vir} = r_{\rm vir}/r_s$, where

    $r_{\rm vir} = \left(\frac{3M_{\rm vir}}{4\pi\Delta_{\rm vir}\rho_{\rm crit}}\right)^{1/3}$    (D1)

is the virial radius of a dark matter halo, and $r_s$ is the characteristic radius of the density distribution, introduced in equations (32)
and (33). In this expression $\rho_{\rm crit} = 3H_0^2/(8\pi G) \approx 9 \times 10^{-30}$ g cm$^{-3}$ is the critical density of the universe and $\Delta_{\rm vir} \approx 200$, depending
on the cosmology, e.g., $\Delta_{\rm vir} = 180$ for an Einstein-de Sitter cosmology.

We consider an NFW profile; results for Bullock et al. (2001) are similar. The mean density of an NFW halo at radius $r$ is

    $\bar\rho(r) = 3\Delta_{\rm vir}\rho_{\rm crit} \left(\frac{r_{\rm vir}}{r}\right)^3 \frac{A(r/r_s)}{A(c_{\rm vir})}$,    (D2)

where

    $A(x) = \ln(1+x) - \frac{x}{1+x}$.    (D3)

In the limit that $r \ll r_s$ this is

    $\bar\rho(r) \approx 1500\,\frac{c_{\rm vir}^2}{A(c_{\rm vir})} \left(\frac{M_{\rm vir}}{10^8 M_\odot}\right)^{1/3} \left(\frac{10\,{\rm pc}}{r}\right) \Delta_{\rm vir}\rho_{\rm crit}$,    (D4)

where we have scaled to the minimum virial mass for a cluster baryon mass of $10^7 M_\odot$. The value of $c_{\rm vir} \approx 10$ for a Milky Way
size halo ($M_{\rm vir} \approx 10^{12} M_\odot$), leading to an estimate of $\bar\rho(10\,{\rm pc}) \approx 4 \times 10^{-21}$ g cm$^{-3}$. From the results reported in Diemand et al.
(2007) and Madau et al. (2008), the concentration $c_{\rm vir} < 50$ for $M_{\rm vir} \approx 10^8 M_\odot$ for subhalos near the center of their parent halo
(note that both Fornax and Virgo UCDs are close to the centers of their parent halos). For these low mass compact halos, we find
$\bar\rho(10\,{\rm pc}) \lesssim 3 \times 10^{-21}$ g cm$^{-3}$, slightly higher than the observed densities of the Milky Way satellite galaxies (when extrapolated to
$r = 10$ pc), but not nearly high enough to be dynamically important in the massive compact clusters and UCDs.
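
The two estimates quoted above can be checked against the small-radius expression (D4). The sketch below evaluates that expression as printed, with the coefficient 1500, $\rho_{\rm crit} = 9 \times 10^{-30}$ g cm$^{-3}$, and $\Delta_{\rm vir} = 200$; the function names are illustrative, and the approximation is valid only for $r \ll r_s$.

```python
import math

RHO_CRIT = 9e-30   # critical density in g cm^-3, as quoted in the text
DELTA_VIR = 200.0  # virial overdensity, approximate value quoted in the text


def nfw_a(x: float) -> float:
    """NFW mass function A(x) = ln(1 + x) - x / (1 + x), equation (D3)."""
    return math.log(1.0 + x) - x / (1.0 + x)


def mean_density_inner(m_vir_msun: float, c_vir: float,
                       r_pc: float = 10.0) -> float:
    """Mean density in g cm^-3 from equation (D4), for r much smaller than r_s."""
    return (1500.0 * c_vir**2 / nfw_a(c_vir)
            * (m_vir_msun / 1e8) ** (1.0 / 3.0)
            * (10.0 / r_pc)
            * DELTA_VIR * RHO_CRIT)


if __name__ == "__main__":
    print(f"Milky Way size halo: {mean_density_inner(1e12, 10.0):.1e} g/cm^3")
    print(f"1e8 Msun subhalo:    {mean_density_inner(1e8, 50.0):.1e} g/cm^3")
```

Both cases give a few times $10^{-21}$ g cm$^{-3}$ at 10 pc, well below the $\sim 10^{-19}$ g cm$^{-3}$ of the compact clusters.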
We note that the numerical calculations performed to date are not of high enough resolution to measure the density on scales of
10 pc. The highest resolution study published to date, that of Diemand et al. (2007), has a force resolution of $\sim 90$ pc, sufficient
to resolve clusters $\sim 300$ pc in radius. Since we have no theoretical understanding of the density profiles found in the simulations,
the extrapolation from 300 pc to 10 pc we use to estimate the dark matter densities is on shaky ground. For this reason we prefer
the direct comparison with objects such as Willman 1, where the dark matter density is measured directly on the relevant scale.